                    
                            
LLMEffiChecker: Understanding and Testing Efficiency Degradation of Large Language Models
                        
                    
    
Large Language Models (LLMs) have received much recent attention due to their human-level accuracy. While existing works mostly focus on either improving accuracy or testing accuracy robustness, the computation efficiency of LLMs, which is of paramount importance due to often vast generation demands and real-time requirements, has surprisingly received little attention. In this article, we make the first attempt to understand and test potential computation efficiency robustness in state-of-the-art LLMs. By analyzing the working mechanism and implementation of 20,543 publicly accessible LLMs, we observe a fundamental property in LLMs that can be manipulated in an adversarial manner to significantly reduce computation efficiency. Our central observation is that the output length, rather than the input, determines the computation efficiency of LLMs, where the output length depends on two factors: an often sufficiently large yet pessimistic pre-configured threshold controlling the maximum number of iterations, and a runtime-generated end-of-sentence (EOS) token. Our key motivation is to generate test inputs that sufficiently delay the generation of EOS, so that LLMs must run through enough iterations to reach the pre-configured threshold. We present LLMEffiChecker, which works in both white-box and black-box settings. In the white-box setting, LLMEffiChecker develops a gradient-guided technique that searches for a minimal and unnoticeable perturbation at the character, token, and structure levels. In the black-box setting, LLMEffiChecker employs a causal inference-based approach to find critical tokens and similarly applies the three levels of imperceptible perturbation to them. Both settings effectively delay the appearance of EOS, compelling the models to iterate until they hit the otherwise rarely reached threshold.
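The output-length property described above can be made concrete with a toy decoding loop. Every name below (`toy_next_token`, `EOS`, `MAX_ITERS`) is illustrative and not drawn from the LLMEffiChecker implementation:

```python
# Toy autoregressive decoding loop illustrating the property described
# above: per-request cost scales with output length, which is bounded by a
# pre-configured iteration threshold and normally cut short by an EOS
# token. Names here are illustrative, not from LLMEffiChecker.

EOS = 0
MAX_ITERS = 50  # pessimistic pre-configured threshold on decoding steps

def toy_next_token(tokens):
    """Stand-in for a model's next-token prediction: emits EOS once
    five tokens have been generated."""
    return EOS if len(tokens) >= 5 else len(tokens) + 1

def decode(next_token_fn):
    tokens = []
    for _ in range(MAX_ITERS):      # each iteration is one forward pass
        tok = next_token_fn(tokens)
        tokens.append(tok)
        if tok == EOS:              # runtime stop condition
            break
    return tokens

normal = decode(toy_next_token)             # halts early at EOS (6 steps)
adversarial = decode(lambda t: len(t) + 1)  # EOS never appears: 50 steps
```

An input that delays EOS forces the decoder through every one of the `MAX_ITERS` forward passes, which is exactly the latency blow-up the attack exploits.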
To demonstrate the effectiveness of LLMEffiChecker, we conduct a systematic evaluation on nine publicly available LLMs: Google T5, AllenAI WMT14, Helsinki-NLP translator, Facebook FairSeq, UNICAMP-DL translator, MarianMT, Google FLAN-T5, MBZUAI LaMini-GPT, and Salesforce CodeGen. Experimental results show that LLMEffiChecker can increase LLMs' response latency and energy consumption by, on average, 325% to 3,244% and 344% to 3,616%, respectively, by perturbing just one character or token in the input sentence. Our case study shows that inputs generated by LLMEffiChecker significantly affect battery power on real-world mobile devices (i.e., they drain more than 30 times the battery power of normal inputs).
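As a rough illustration of what the one-character or one-token perturbations measured above can look like, here is a toy sketch. These random mutations only illustrate the *kind* of edit at each of the three levels; LLMEffiChecker itself selects perturbations with gradient guidance (white-box) or causal inference (black-box), not at random:

```python
import random

def perturb_character(sentence, rng):
    """Character level: swap two adjacent characters in one random word."""
    words = sentence.split()
    i = rng.randrange(len(words))
    w = words[i]
    if len(w) >= 2:
        j = rng.randrange(len(w) - 1)
        w = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    words[i] = w
    return " ".join(words)

def perturb_token(sentence, rng, filler="umm"):
    """Token level: insert one low-content token at a random position."""
    words = sentence.split()
    words.insert(rng.randrange(len(words) + 1), filler)
    return " ".join(words)

def perturb_structure(sentence):
    """Structure level: drop final punctuation, a sentence-boundary cue."""
    return sentence.rstrip(".!?")
```

Each mutation is imperceptible or near-imperceptible to a human reader, yet can change how soon the model emits EOS.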
        
    
- Award ID(s): 2146443
- PAR ID: 10592636
- Publisher / Repository: ACM
- Date Published:
- Journal Name: ACM Transactions on Software Engineering and Methodology
- Volume: 33
- Issue: 7
- ISSN: 1049-331X
- Page Range / eLocation ID: 1 to 38
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
A new image-reconstruction algorithm, Principal-component Interferometric Modeling (PRIMO), applied to the interferometric data of the M87 black hole collected with the Event Horizon Telescope (EHT), resulted in an image that reached the native resolution of the telescope array. PRIMO is based on learning a compact set of image building blocks obtained from a large library of high-fidelity, physics-based simulations of black hole images. It uses these building blocks to fill the sparse Fourier coverage of the data that results from the small number of telescopes in the array. In this paper, we show that this approach is readily justified. Since the angular extent of the image of the black hole and of its inner accretion flow is finite, the Fourier-space domain is heavily smoothed, with a correlation scale that is at most comparable to the sizes of the gaps in the EHT's coverage of Fourier space. Consequently, PRIMO or other machine-learning algorithms can faithfully reconstruct the images without needing to generate information that is unconstrained by the data within the resolution of the array. We also address the completeness of the eigenimages and the compactness of the resulting representation. We show that PRIMO provides a compact set of eigenimages with sufficient complexity to recreate a broad set of images well beyond those in the training set.
- 
Large language models have gained significant popularity and are often provided as a service (i.e., LLMaaS). Companies like OpenAI and Google provide online APIs of LLMs so that downstream users can create innovative applications. Despite this popularity, LLM safety and quality assurance is a well-recognized concern in the real world, requiring extra effort to test these LLMs. Unfortunately, while end-to-end services like ChatGPT have garnered rising attention in terms of testing, LLMaaS embeddings have received comparatively little scrutiny. We argue for the importance of testing and uncovering problematic individual embeddings without considering downstream applications. The abstraction and non-interpretability of embedding vectors, combined with the black-box inaccessibility of LLMaaS, make this testing challenging. This paper proposes COSTELLO, a black-box approach that reveals potential defects in abstract embedding vectors from LLMaaS by contrastive testing. Our intuition is that high-quality LLMs adequately capture the semantic relationships of input texts and properly represent those relationships in the high-dimensional space. Given the LLMaaS interface and seed inputs, COSTELLO automatically generates test suites and outputs words with potentially problematic embeddings. The idea is to synthesize contrastive samples, including positive and negative ones, by mutating seed inputs under guidance. The synthesis guide leverages task-specific properties to control the mutation procedure and generate samples with known partial relationships in the high-dimensional space. We can thus compare the expected relationship (the oracle) with the embedding distance (the LLMs' output) to locate potentially buggy cases. We evaluate COSTELLO on 42 open-source (encoder-based) language models and two real-world commercial LLMaaS offerings. Experimental results show that COSTELLO can effectively detect semantic violations, where more than 62% of violations on average result in erroneous behaviors (e.g., unfairness) of downstream applications.
- 
We propose an intriguingly simple method for constructing adversarial images in the black-box setting. In contrast to the white-box scenario, constructing black-box adversarial images has the additional constraint of a query budget, and efficient attacks remain an open problem to date. With only the mild assumption of continuous-valued confidence scores, our highly query-efficient algorithm follows a simple iterative principle: we randomly sample a vector from a predefined orthonormal basis and either add it to or subtract it from the target image. Despite its simplicity, the proposed method can be used for both untargeted and targeted attacks, resulting in previously unprecedented query efficiency in both settings. We demonstrate the efficacy and efficiency of our algorithm in several real-world settings, including the Google Cloud Vision API. We argue that our proposed algorithm should serve as a strong baseline for future black-box attacks, in particular because it is extremely fast and its implementation requires less than 20 lines of PyTorch code.
- 
Accurate citywide crowd activity prediction (CAP) can enable proactive crowd mobility management and timely responses to urban events, which has become increasingly important for a myriad of smart city planning and management purposes. However, complex correlations across crowd activities, spatial and temporal urban environment features and their interactive dependencies, and relevant external factors (e.g., weather conditions) make it highly challenging to predict crowd activities accurately across venue categories (for instance, venues related to dining, services, and residence) and varying degrees (e.g., daytime and nighttime). To address these concerns, we propose STICAP, a citywide spatio-temporal interactive crowd activity prediction approach. In particular, STICAP takes location-based social network check-in data (e.g., from Foursquare/Gowalla) as the model input and forecasts the crowd activity within each time step for each venue category. Furthermore, we integrate multiple levels of temporal discretization to interactively capture the relations with historical data. Three parallel Residual Spatial Attention Networks (RSAN) in the Spatial Attention Component exploit the hourly, daily, and weekly spatial features of crowd activities, which are further fused and processed by the Temporal Attention Component for interactive CAP. Along with other external factors such as weather conditions and holidays, STICAP adaptively and accurately forecasts the final crowd activities per venue category, enabling potential activity recommendation and other smart city applications. Extensive experiments on three real-world crowd activity datasets demonstrate that STICAP outperforms the baseline and state-of-the-art algorithms in CAP accuracy, with an average error reduction of 35.02%.
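The query-efficient black-box attack summarized above reports an implementation of under 20 lines of PyTorch; a rough NumPy analogue of its iterative principle can be sketched as follows. All names and defaults here are ours, not the paper's, and the standard pixel basis stands in for whatever orthonormal basis is chosen:

```python
import numpy as np

def simple_blackbox_attack(x, confidence, eps=0.2, max_queries=1000, seed=0):
    """Repeatedly pick a direction from an orthonormal basis (here, the
    standard pixel basis) and keep the +eps or -eps step whenever it
    lowers the model's confidence in the true class. `confidence` is the
    only model access, matching the score-based black-box setting."""
    rng = np.random.default_rng(seed)
    x = x.astype(float).copy()
    best = confidence(x)
    queries = 0
    for d in rng.permutation(x.size):    # each basis vector at most once
        if queries >= max_queries:
            break
        for sign in (eps, -eps):
            step = np.zeros(x.size)
            step[d] = sign
            cand = x + step.reshape(x.shape)
            queries += 1
            score = confidence(cand)
            if score < best:             # keep the step only if it helps
                x, best = cand, score
                break
    return x, best, queries
```

Against a toy score function, the loop monotonically drives the confidence down while spending at most two queries per coordinate, which is the source of the query efficiency the abstract claims.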
 An official website of the United States government