Title: A Note on Exploratory Item Factor Analysis by Singular Value Decomposition
Abstract: We revisit a singular value decomposition (SVD) algorithm given in Chen et al. (Psychometrika 84:124–146, 2019b) for exploratory item factor analysis (IFA). This algorithm estimates a multidimensional IFA model by SVD and was used to obtain a starting point for joint maximum likelihood estimation in Chen et al. (2019b). Thanks to the analytic and computational properties of SVD, this algorithm guarantees a unique solution and has a computational advantage over other exploratory IFA methods, one that becomes significant when the numbers of respondents, items, and factors are all large. The algorithm can be viewed as a generalization of principal component analysis to binary data. In this note, we provide the statistical underpinning of the algorithm. In particular, we show its statistical consistency under the same double asymptotic setting as in Chen et al. (2019b). We also demonstrate how the algorithm provides a scree plot for investigating the number of factors and establish its asymptotic theory. Further extensions of the algorithm are discussed. Finally, simulation studies suggest that the algorithm has good finite sample performance.
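The estimation procedure lends itself to a compact implementation. Below is a minimal sketch in the spirit of the SVD-based algorithm described above, assuming a logistic link; the function name, the use of one extra singular dimension for the item intercepts, the truncation threshold eps, and the scaling of scores and loadings are illustrative simplifications, not the authors' exact specification.

```python
# Minimal sketch of an SVD-based exploratory IFA estimator (illustrative;
# assumes a logistic link and a fixed truncation threshold).
import numpy as np

def svd_ifa(Y, K, eps=0.01):
    """Y: N x J binary (0/1) response matrix; K: number of factors."""
    N, J = Y.shape
    # Step 1: rank-(K+1) truncated SVD of the raw responses; the extra
    # dimension is kept to absorb the item intercepts (a simplification).
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    P_hat = U[:, :K + 1] @ np.diag(s[:K + 1]) @ Vt[:K + 1, :]
    # Step 2: truncate entries into (0, 1) so the inverse link is finite.
    P_hat = np.clip(P_hat, eps, 1 - eps)
    # Step 3: apply the inverse logistic link entrywise to recover the
    # linear-predictor (log-odds) matrix.
    M_hat = np.log(P_hat / (1 - P_hat))
    # Step 4: SVD of M_hat; leading singular vectors give factor scores and
    # loadings up to rotation, and the singular values support a scree plot.
    U2, s2, V2t = np.linalg.svd(M_hat, full_matrices=False)
    scores = np.sqrt(N) * U2[:, :K + 1]
    loadings = (V2t[:K + 1, :].T * s2[:K + 1]) / np.sqrt(N)
    return scores, loadings, s2
```

The returned singular values s2, plotted in decreasing order, give the scree plot mentioned in the abstract for investigating the number of factors.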
Award ID(s):
1712657
NSF-PAR ID:
10298398
Author(s) / Creator(s):
Date Published:
Journal Name:
Psychometrika
Volume:
85
Issue:
2
ISSN:
0033-3123
Page Range / eLocation ID:
358 to 372
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Crowdsourcing has rapidly become a computing paradigm in machine learning and artificial intelligence. In crowdsourcing, multiple labels are collected from crowd-workers on an instance, usually through the Internet. These labels are then aggregated into a single label intended to match the ground truth of the instance. Due to its open nature, human workers in crowdsourcing come with various levels of knowledge and socio-economic backgrounds. Effectively handling such human factors has been a focus in the study and application of crowdsourcing. For example, Bi et al. studied the impacts of worker dedication, expertise, judgment, and task difficulty (Bi et al. 2014). Qiu et al. offered methods for selecting workers based on behavior prediction (Qiu et al. 2016). Barbosa and Chen suggested rehumanizing crowdsourcing to deal with human biases (Barbosa and Chen 2019). Checco et al. studied adversarial attacks on crowdsourcing for quality control (Checco et al. 2020). Many more related works are available in the literature. In contrast to commonly used binary-valued labels, interval-valued labels (IVLs) have been introduced very recently (Hu et al. 2021). Applying statistical and probabilistic properties of interval-valued datasets, Spurling et al. quantitatively defined worker reliability in four measures: correctness, confidence, stability, and predictability (Spurling et al. 2021). Calculating these measures, except correctness, does not require the ground truth of each instance but only the worker's IVLs. Applying these quantified reliability measures has significantly improved the overall quality of crowdsourcing (Spurling et al. 2022). In real-world applications, however, a worker's reliability may vary over time rather than remaining constant, so it is necessary to monitor it dynamically. Because a worker j labels instances sequentially, we treat j's IVLs as an interval-valued time series in our approach. Assuming j's reliability depends only on the IVLs within a time window, we calculate j's reliability measures from the IVLs in the current window. Moving the time window forward with our proposed practical strategies, we can monitor j's reliability dynamically. The four reliability measures derived from IVLs are therefore time-varying as well; with regression analysis, we can separate each measure into an explainable trend and possible errors. To validate our approaches, we use four real-world benchmark datasets in our computational experiments. The main findings are as follows. The reliability-weighted interval majority voting (WIMV) and weighted preferred matching probability (WPMP) schemes consistently outperform the base schemes, namely majority voting (MV), interval majority voting (IMV), and preferred matching probability (PMP), with much higher accuracy, precision, recall, and F1-score. By monitoring worker reliability, our computational experiments successfully identified possible attackers, and removing the identified attackers ensured label quality. We also examined the impact of window size selection. It is necessary to monitor worker reliability dynamically, and our computational results demonstrate the potential of our approaches. This work is partially supported by the US National Science Foundation through grant award NSF/OIA-1946391.
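To make the moving-window monitoring concrete, here is a minimal sketch of reliability-weighted interval majority voting over a sliding window. The reliability proxy used here (stability of interval widths), the window mechanics, and all names are illustrative assumptions for exposition, not the exact measures defined by Spurling et al.

```python
# Illustrative sketch: weighted interval majority voting (WIMV-style) with a
# sliding window for dynamic worker reliability (assumed proxy measure).
import numpy as np

def stability(intervals):
    """Crude reliability proxy: a worker whose interval widths vary little
    within the window is treated as more stable, hence more reliable."""
    widths = np.array([hi - lo for (lo, hi) in intervals])
    return 1.0 / (1.0 + widths.std())

def wimv(labels_by_worker, window=50):
    """labels_by_worker: {worker_id: [(lo, hi), ...]} time-ordered
    interval-valued labels in [0, 1], one list entry per instance.
    Returns one aggregated binary label per instance."""
    n = min(len(seq) for seq in labels_by_worker.values())
    votes = []
    for t in range(n):
        num = den = 0.0
        for seq in labels_by_worker.values():
            # Windowed reliability: only the IVLs in the current window count.
            w = stability(seq[max(0, t - window + 1):t + 1])
            lo, hi = seq[t]
            num += w * (lo + hi) / 2.0  # interval midpoint as the worker's vote
            den += w
        votes.append(1 if num / den >= 0.5 else 0)
    return votes
```

A sustained drop in a worker's windowed reliability relative to that worker's own trend is the kind of signal that, in this scheme, would flag a possible attacker.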
  2. We introduce a notion of a "generic local algorithm" that strictly generalizes existing frameworks of local algorithms, such as factors of i.i.d., by capturing local quantum algorithms such as the Quantum Approximate Optimization Algorithm (QAOA). Motivated by a question of Farhi et al. [arXiv:1910.08187, 2019], we then show limitations of generic local algorithms, including QAOA, on random instances of constraint satisfaction problems (CSPs). Specifically, we show that any generic local algorithm whose assignment to a vertex depends only on a local neighborhood containing o(n) other vertices (such as the QAOA at depth less than ϵ log(n)) cannot approximate Boolean CSPs arbitrarily well if the problem satisfies a geometric property from statistical physics called the coupled overlap-gap property (OGP) [Chen et al., Annals of Probability, 47(3), 2019]. We show that the random MAX-k-XOR problem has this property when k ≥ 4 is even, by extending the corresponding result for diluted k-spin glasses. Our concentration lemmas confirm a conjecture of Brandao et al. [arXiv:1812.04170, 2018] asserting that the landscape independence of QAOA extends to logarithmic depth; in other words, for every fixed choice of QAOA angle parameters, the algorithm at logarithmic depth performs almost equally well on almost all instances. One of these concentration lemmas is a strengthening of McDiarmid's inequality, applicable when the random variables have a highly biased distribution, and may be of independent interest.
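For reference, the classical bounded-differences form of McDiarmid's inequality that this lemma strengthens can be stated as follows (a standard statement, included here for context):

```latex
% McDiarmid's inequality: if changing any single coordinate x_i changes
% f by at most c_i, then for independent X_1, ..., X_n and all t > 0,
\[
\Pr\left( \left| f(X_1,\dots,X_n) - \mathbb{E}\, f(X_1,\dots,X_n) \right| \ge t \right)
\le 2 \exp\!\left( \frac{-2t^2}{\sum_{i=1}^{n} c_i^2} \right).
\]
```

The strengthening referred to above tightens this type of bound when the X_i have a highly biased distribution, as noted in the abstract.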
  3. Memories are an important part of how we think, understand the world around us, and plan future actions. In the brain, memories are thought to be stored in a region called the hippocampus. When memories are formed, neurons store events that occur around the same time together. This might explain why, in the brains of animals, the activity associated with retrieving memories is often not just a snapshot of what happened at a specific moment; it can also include information about what the animal might experience next. Such predictiveness has clear utility if animals use memories to anticipate what they might experience next and to plan future actions. Mathematically, this notion of predictiveness can be summarized by an algorithm known as the successor representation. This algorithm describes what the activity of neurons in the hippocampus looks like when retrieving memories and making predictions based on them. However, even though the successor representation can computationally reproduce the activity seen in the hippocampus when it is making predictions, it is unclear what biological mechanisms underpin this computation in the brain. Fang et al. approached this problem by trying to build a model that could generate the same activity patterns computed by the successor representation using only biological mechanisms known to exist in the hippocampus. First, they used computational methods to design a network of neurons that had the biological properties of neural networks in the hippocampus. They then used the network to simulate neural activity. The results show that the activity of the network they designed was able to match the successor representation exactly. Additionally, the data resulting from the simulated activity in the network fit experimental observations of hippocampal activity in Tufted Titmice. One advantage of the network designed by Fang et al. is that it can generate predictions in flexible ways. That is, it can make both short- and long-term predictions from what an individual is experiencing at the moment. This flexibility means that the network can be used to simulate how the hippocampus learns in a variety of cognitive tasks. Additionally, the network is robust to different conditions. Given that the brain has to be able to store memories in many different situations, this is a promising indication that this network may be a reasonable model of how the brain learns. The results of Fang et al. lay the groundwork for connecting biological mechanisms in the hippocampus at the cellular level to cognitive effects, an essential step to understanding the hippocampus as well as its role in health and disease. For instance, their network may provide a concrete approach to studying how disruptions to the ways neurons make and break connections can impair memory formation. More generally, better models of the biological mechanisms involved in making computations in the hippocampus can help scientists better understand and test theories about how memories are formed and stored in the brain.
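For readers who want the algorithmic core, the successor representation has a simple closed form: for a state-transition matrix T and discount factor gamma in [0, 1), the SR matrix is M = sum_{t>=0} gamma^t T^t = (I - gamma T)^(-1). A minimal sketch follows; the ring-world transition matrix is an illustrative assumption, not the biological network model of Fang et al.

```python
# Minimal sketch of the successor representation (SR): the discounted sum of
# expected future state occupancies, M = (I - gamma * T)^(-1).
import numpy as np

def successor_representation(T, gamma=0.9):
    """Closed-form SR for a Markov transition matrix T (rows sum to 1)."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

# Example: an unbiased random walk on a 5-state ring (illustrative only).
n = 5
T = np.zeros((n, n))
for i in range(n):
    T[i, (i - 1) % n] = T[i, (i + 1) % n] = 0.5
M = successor_representation(T)
# Row i of M is a predictive map: the discounted expected number of future
# visits to each state, starting from state i.
```

The contribution of Fang et al. is to show that a network built only from known hippocampal mechanisms can compute exactly this quantity.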
  4. All-solid-state batteries (ASSBs) have garnered increasing attention due to their enhanced safety, featuring nonflammable solid electrolytes, as well as their potential to achieve high energy density [1]. The advancement of ASSBs is expected to provide, arguably, the most straightforward path towards practical, high-energy, rechargeable batteries based on metallic anodes [1]. However, sluggish ion transmission at the cathode-electrolyte (solid/solid) interface results in high interfacial resistance and limits the practical implementation of these all-solid-state materials in real-world batteries [2]. Several methods have been suggested to enhance the kinetics of ion migration between the cathode and the solid electrolyte (SE) [3]. A composite strategy that mixes active materials and SEs in the cathode is a common way to lower the ion transmission barrier at the cathode-electrolyte interface [3]. However, the active material concentration in the cathode decreases as the SE portion increases, which restricts the energy density of the ASSB. In addition, the mixing approach generally introduces lattice mismatches between the cathode active materials and the SE, and thus provides only limited improvement, which is attributed to the random contacts formed between the cathode active materials and the SE during mixing. Applying high pressure to the electrode and electrolyte during ASSB assembly has been verified as an effective way to boost ion transmission between the cathode active materials and the SE by decreasing the grain boundary impedance, but mechanical deformation of the electrolyte under high pressure can short-circuit the battery [4]. Herein, we demonstrate a novel way to address the ion transmission problem at the cathode-electrolyte interface in ASSBs. Starting from the cathode configuration, the finite element method (FEM) is employed to evaluate the current concentration and the distribution of the space-charge layer at the cathode-electrolyte interface. In the resulting FEM simulations, hierarchical three-dimensional (HTD) structures are found to have a higher Li+ transfer number (tLi+), fewer free anions, and a weaker space-charge layer at the cathode-electrolyte interface. To take advantage of the HTD structure, stereolithography is adopted as the manufacturing technique and single-crystalline Ni-rich (SCN) materials are selected as the active materials. The manufactured HTD cathode is then sintered at 600 °C in an N2 atmosphere to carbonize the resin, which provides sufficient electronic conductivity for the cathode. Next, a gel-like Li1.4Al0.4Ti1.6(PO4)3 (LATP) precursor is synthesized and thoroughly filled into the voids of the HTD-structure cathode, and the filled cathodes are sintered at 900 °C to crystallize the LATP gel. Scanning transmission electron microscopy (STEM) is used to unveil the morphology of the interface between the sintered HTD cathode and the in-situ generated electrolyte (LATP). A transient phase is found at the interface, lattice-matched to both the SCN and the SE, which accelerates Li-ion transmission; this is further verified by density functional theory calculations. In addition, electron energy loss spectroscopy demonstrates that the interface between the HTD cathode and the SE is preserved.
Atomic force microscopy in peak-force tapping mode is employed to measure the potential image of the cross-sectional interface. The average potential of the modified samples is lower than that of a reference sample in which SCN and SE are simply mixed in a 2D planar structure, confirming a weakened space-charge layer owing to the enhanced contact and ion transmission. To test whether the demonstrated method is universally applicable, LiNi0.8Co0.1Mn0.1O2 (NCM811) is selected as the cathode active material and manufactured in the same way as the SCN. The HTD cathode based on NCM811 exhibits higher electrochemical performance than the reference sample based on the 2D planar mixing-type cathode. We believe this universal strategy provides a new guideline for engineering the cathode/electrolyte interface by revolutionizing electrode structures, applicable to all-solid-state batteries.
Figure 1. Schematic comparison of the traditional 2D planar cathode and the HTD cathode in an ASSB.
[1] Tikekar, M. D., et al., Nature Energy (2016) 1(9), 16114
[2] Banerjee, A., et al., Chem. Rev. (2020) 120(14), 6878
[3] Chen, R., et al., Chem. Rev. (2020) 120(14), 6820
[4] Cheng, X., et al., Advanced Energy Materials (2018) 8(7)
  5. The ongoing lack of diversity in STEM fields has been described as both: a) a critical issue with a detrimental impact on the United States' ability to compete in global innovation (Chen, 2013) and b) a systemic issue that excludes certain groups of people from opportunities for economic mobility and job security (Wait & McDonald, 2019). Historically excluded groups, such as women, Black/African Americans, Latin Americans, and economically disadvantaged individuals, continue to be in the minority in STEM (Carnevale et al., 2021). Through years of research on historically excluded groups, researchers have asserted the importance of developing an engineering identity in determining later success in engineering (Allen & Eisenhart, 2017; Kang et al., 2019; Stipanovic & Woo, 2017). Only 8% of all engineering students enter higher education from low-income backgrounds (NCES, 2016; Major et al., 2018), and these students often face significant barriers to their success (Chen, 2013; Hoxby & Avery, 2012), yet historically they have received very little attention in the research. Our study seeks to address this gap and to support a developing understanding of how high-achieving, low-income students form an engineering identity, as well as the intersectionality and salience of their other socio-cultural identities. Using the theoretical framework of figured worlds (Holland et al., 1998; Waide-James & Schwartz, 2019), we explored what factors shaped the formation of an engineering identity for high-achieving, low-income college students participating in an engineering scholarship program. Specifically, our research questions were: 1) What factors shape the formation of engineering identity for high-achieving, low-income students participating in an engineering scholarship program? and 2) How salient are other social identities in the formation of their engineering identity? A constructivist grounded theory (Charmaz, 2014) design guided our selection of individual interviews and focus groups as data collection tools, allowing us to tailor our interview questions and shape our programming around the needs of participants. NSF S-STEM-sponsored program activities that could shape the figured world of participants included intentional mentoring, cohort-based seminars, targeted coursework in design courses, and connecting students to internships and co-ops. Emerging themes from our preliminary data analysis reveal the importance of peer relationships, professional mentorship, and cultural wealth, including social capital. Preliminary results from this study have the potential to increase understanding of how best to support the success of high-achieving, low-income college students in engineering programs, including through targeted interventions and supports, and to shed further light on the skills these students use to overcome systemic barriers.