Title: Linear representations of finite geometries and associated LDPC codes
The linear representation of a subset of a finite projective space is an incidence system of affine points and lines determined by the subset. In this paper we use character theory to show that the rank of the incidence matrix has a direct geometric interpretation in terms of certain hyperplanes. We consider the LDPC codes defined by taking the incidence matrix and its transpose as parity-check matrices, and in the former case prove a conjecture of Vandendriessche that the code is generated by words of minimum weight called plane words. In the latter case we compute the minimum weight in several cases and provide explicit constructions of minimum weight codewords.
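The codes in question are null spaces over GF(2): if H is the point-line incidence matrix, the LDPC code it defines is {x : Hx = 0}, so its dimension is n minus the GF(2) rank of H. Below is a minimal sketch of that pipeline, using the ordinary affine plane AG(2,3) as a stand-in incidence system (not the linear representation T*(K) studied in the paper):

    # Sketch: a point-line incidence matrix as an LDPC parity-check matrix.
    # AG(2,3) is an illustrative stand-in for the paper's geometries.
    q = 3
    points = [(x, y) for x in range(q) for y in range(q)]
    lines = [{(x, (a * x + b) % q) for x in range(q)}        # y = ax + b
             for a in range(q) for b in range(q)]
    lines += [{(c, y) for y in range(q)} for c in range(q)]  # x = c

    idx = {p: i for i, p in enumerate(points)}
    rows = []                                  # bit-pack each line's row of H
    for ln in lines:
        bits = 0
        for p in ln:
            bits |= 1 << idx[p]
        rows.append(bits)

    def gf2_rank(vectors):
        """Gaussian elimination over GF(2) on bit-packed row vectors."""
        basis = {}                             # leading-bit position -> vector
        for v in vectors:
            while v:
                lead = v.bit_length() - 1
                if lead not in basis:
                    basis[lead] = v
                    break
                v ^= basis[lead]
        return len(basis)

    r = gf2_rank(rows)
    n = len(points)
    print(f"H is {len(lines)} x {n}; GF(2) rank = {r}; code dimension = {n - r}")

Feeding the transpose of H through the same computation gives the second family of codes the abstract considers, with points as check rows instead of lines.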
Award ID(s):
1855723
NSF-PAR ID:
10172605
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Journal of Combinatorial Theory, Series A
Volume:
173
ISSN:
0097-3165
Page Range / eLocation ID:
105238
Sponsoring Org:
National Science Foundation
More Like This
  1. Abstract: Jury notetaking can be controversial despite evidence suggesting benefits for recall and understanding. Research on notetaking has historically focused on the deliberation process, yet little research explores the notes themselves. We developed a 10-item coding guide to explore what jurors take notes on (e.g., simple vs. complex evidence) and how they take notes (e.g., gist vs. specific representation). In general, jurors made gist representations of both simple and complex information in their notes. This finding is consistent with Fuzzy Trace Theory (Reyna & Brainerd, 1995) and suggests notes may serve as a general memory aid rather than a verbatim representation.

Summary: The practice of jury notetaking in the courtroom is often contested. Some states allow it (e.g., Nebraska: State v. Kipf, 1990), while others forbid it (e.g., Louisiana: La. Code of Crim. Proc., Art. 793). Some argue notes may serve as a memory aid, increase juror confidence during deliberation, and help jurors engage in the trial (Hannaford & Munsterman, 2001; Heuer & Penrod, 1988, 1994). Others argue notetaking may distract jurors from listening to evidence, that juror notes may be given undue weight, and that those who took notes may dictate the deliberation process (Dann, Hans, & Kaye, 2005). While research has evaluated the efficacy of juror notes on evidence comprehension, little work has explored the specific content of juror notes. In a similar project on which we build, Dann, Hans, and Kaye (2005) found jurors took on average 270 words of notes each, with 85% including references to jury instructions. In the present study we use a content analysis approach to examine how jurors take notes about simple and complex evidence. We were particularly interested in how jurors captured gist and specific (verbatim) information in their notes, as the two have different implications for information recall during deliberation. According to Fuzzy Trace Theory (Reyna & Brainerd, 1995), people extract "gist" or qualitative meaning from information, and also exact, verbatim representations. Although both are important for helping people make well-informed judgments, gist-based understandings are purported to be even more important than verbatim understanding (Reyna, 2008; Reyna & Brainerd, 2007). As such, it is useful to examine how laypeople represent information in their notes during deliberation of evidence.

Methods: Prior to watching a 45-minute mock bank robbery trial, jurors were given a pen and notepad and instructed that they were permitted to take notes. The evidence included testimony from the defendant, witnesses, and expert witnesses for the prosecution and defense. Expert testimony described complex mitochondrial DNA (mtDNA) evidence. The present analysis consists of pilot data representing 2,733 lines of notes from 52 randomly selected jurors across 41 mock juries. Our final sample for presentation at AP-LS will consist of all 391 juror notes in our dataset. Based on previous research exploring jury notetaking, as well as our specific interest in gist vs. specific encoding of information, we developed a coding guide to quantify juror notetaking behaviors. Four researchers independently coded a subset of notes. Coders achieved acceptable interrater reliability (Cronbach's alpha = .80-.92 on all variables across 20% of cases). Prior to AP-LS, we will link juror notes with how jurors discuss scientific and non-scientific evidence during jury deliberation.

Coding: Note length. Before coding for content, coders counted lines of text. Each notepad line with at least one complete word was coded as a line of text. Gist vs. specific information. Any line referencing evidence was coded as gist or specific. We coded information as gist if it contained no specific details but summarized the meaning of the evidence (e.g., "bad, not many people excluded"). Information was coded as specific if it contained a verbatim descriptive (e.g., "<1 of people could be excluded"). We further coded whether this information related to the non-scientific evidence or to the scientific DNA evidence. Mentions of DNA evidence vs. other evidence. We were specifically interested in whether jurors mentioned the DNA evidence and how they captured complex evidence. When DNA evidence was mentioned, we coded the content of the reference: characteristics of mtDNA vs. nDNA, the DNA match process or who could be excluded, heteroplasmy, references to database size, and other references. Reliability. When referencing DNA evidence, we were interested in whether jurors mentioned its reliability. Any specific mention of the reliability of DNA evidence was noted (e.g., "MT DNA is not as powerful, more prone to error"). Expert qualification. Finally, we were interested in whether jurors noted an expert's qualifications. All such references were coded (e.g., "Forensic analyst").

Results: On average, jurors took 53 lines of notes (range: 3-137 lines). Most (83%) mentioned jury instructions before moving on to case-specific information. The majority of references to evidence were gist references (54%), split equally (50%) between non-scientific evidence and scientific expert testimony. When jurors encoded information using specific references (46%), they likewise referenced non-scientific evidence and expert testimony equally (50%). Thirty-three percent of lines were devoted to expert testimony, with every juror including at least one such line. References to the DNA evidence most often concerned who could be excluded from the FBI's database (43%), followed by differences between mtDNA and nDNA (30%) and the size of the database (11%). Less frequently, references to DNA evidence focused on heteroplasmy (5%). Of the references that did not fit a coding category (11%), most concerned the DNA extraction process, general information about DNA, and the uniqueness of DNA. We further coded references to DNA reliability (15%) and references to specific statistical information (14%). Finally, 40% of jurors made reference to an expert's qualifications.

Conclusion: Content analysis of jury notes can reveal important information about how jurors capture trial information (e.g., gist vs. verbatim), what evidence they consider important, and what they consider relevant and irrelevant. In our case, jurors largely created gist representations of information, focused equally on non-scientific evidence and scientific expert testimony. This finding suggests notetaking may serve not only to record information verbatim but also, and perhaps mostly, as a general memory aid summarizing the meaning of the evidence. Further, jurors' references to evidence were about equally focused on the non-scientific evidence and the scientifically complex DNA evidence. This observation suggests jurors may attend just as much to non-scientific evidence as to complex scientific evidence in cases involving complicated evidence, an observation that might inform future work on how jurors interpret evidence in cases with complex information.

Learning objective: Participants will be able to describe emerging evidence about how jurors take notes during trial.
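For readers wanting to reproduce a reliability figure of this kind, Cronbach's alpha for interrater agreement can be computed by treating each coder as an "item" and each coded note as a case: alpha = k/(k-1) * (1 - sum of per-coder variances / variance of the totals). A minimal sketch with invented scores (the study's actual coder ratings are not given here):

    from statistics import pvariance  # population variance

    def cronbach_alpha(ratings):
        """ratings: one list of scores per coder, aligned across the same notes."""
        k = len(ratings)
        item_var = sum(pvariance(r) for r in ratings)
        totals = [sum(scores) for scores in zip(*ratings)]
        return k / (k - 1) * (1 - item_var / pvariance(totals))

    # four hypothetical coders scoring the same five juror notes
    coders = [
        [3, 1, 4, 2, 5],
        [3, 2, 4, 2, 4],
        [2, 1, 5, 2, 5],
        [3, 1, 4, 3, 5],
    ]
    print(f"alpha = {cronbach_alpha(coders):.2f}")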
  2. We study the rank of a random n × m matrix A_{n,m;k} with entries from GF(2) and exactly k unit entries in each column, the other entries being zero. The columns are chosen independently and uniformly at random from the set of all (n choose k) such columns. We obtain an asymptotically correct estimate for the rank as a function of the number of columns m, in terms of c, n, and k, where m = cn/k. The matrix A_{n,m;k} forms the vertex-edge incidence matrix of a k-uniform random hypergraph H, and its rank can be expressed as follows. Let |C_2| be the number of vertices of the 2-core of H, and |E(C_2)| the number of its edges. Let m* be the value of m for which |C_2| = |E(C_2)|. Then w.h.p. for m < m* the rank of A_{n,m;k} is asymptotic to m, and for m ≥ m* the rank is asymptotic to m − |E(C_2)| + |C_2|. In addition, assign i.i.d. U[0,1] weights X_i, i ∈ {1, 2, …, m}, to the columns, and define the weight of a set of columns S as X(S) = ∑_{j∈S} X_j. Define a basis as a set of n − 1 (k even) linearly independent columns. We obtain an asymptotically correct estimate for the minimum weight of a basis. This generalises the well-known result of Frieze [On the value of a random minimum spanning tree problem, Discrete Applied Mathematics, 1985] that, for k = 2, the expected length of a minimum weight spanning tree tends to ζ(3) ≈ 1.202.
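The rank behaviour is easy to observe empirically: sample m columns with exactly k ones each, bit-pack them, and row-reduce over GF(2). A minimal simulation sketch; n, k, and the values of c below are arbitrary toy parameters, not values from the paper:

    import random

    def gf2_rank(vectors):
        """Gaussian elimination over GF(2) on bit-packed vectors."""
        basis = {}                        # leading-bit position -> vector
        for v in vectors:
            while v:
                lead = v.bit_length() - 1
                if lead not in basis:
                    basis[lead] = v
                    break
                v ^= basis[lead]
        return len(basis)

    def random_column(n, k):
        """Uniformly random 0/1 column of length n with exactly k ones."""
        bits = 0
        for i in random.sample(range(n), k):
            bits |= 1 << i
        return bits

    n, k = 2000, 3
    for c in (0.5, 0.8, 1.1, 1.4):        # m = c * n / k columns
        m = int(c * n / k)
        cols = [random_column(n, k) for _ in range(m)]
        print(f"c = {c}: m = {m}, rank = {gf2_rank(cols)}")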
  3. We study the log-rank conjecture from the perspective of point-hyperplane incidence geometry. We formulate the following conjecture: given a point set in ℝ^d that is covered by constant-sized sets of parallel hyperplanes, there exists an affine subspace that accounts for a large (i.e., 2^{-polylog(d)}) fraction of the incidences, in the sense of containing a large fraction of the points and being contained in a large fraction of the hyperplanes. In other words, the point-hyperplane incidence graph for such configurations has a large complete bipartite subgraph. Alternatively, our conjecture may be interpreted linear-algebraically as follows: any rank-d matrix containing at most O(1) distinct entries in each column contains a submatrix of fractional size 2^{-polylog(d)} in which each column is constant. We prove that our conjecture is equivalent to the log-rank conjecture; the crucial ingredient of this proof is a reduction from bounds for parallel k-partitions to bounds for parallel (k-1)-partitions. We also introduce an (apparent) strengthening of the conjecture, which relaxes the requirement that the sets of hyperplanes be parallel. Motivated by the connections above, we revisit well-studied questions in point-hyperplane incidence geometry without structural assumptions (i.e., the existence of partitions). We give an elementary argument for the existence of complete bipartite subgraphs of density Ω(ε^{2d}/d) in any d-dimensional configuration with incidence density ε, qualitatively matching previous results proved using sophisticated geometric techniques. We also improve an upper-bound construction of Apfelbaum and Sharir [2], yielding a configuration whose complete bipartite subgraphs are exponentially small and whose incidence density is Ω(1/√d). Finally, we discuss various constructions (due to others) of products of Boolean matrices which yield configurations with incidence density Ω(1) and complete bipartite subgraph density 2^{-Ω(√d)}, and pose several questions for this special case in the alternative language of extremal set combinatorics. Our framework and results may help shed light on the difficulty of improving Lovett's Õ(√rank(f)) bound [20] for the log-rank conjecture. In particular, any improvement on this bound would imply the first complete bipartite subgraph size bounds for parallel 3-partitioned configurations which beat our generic bounds for unstructured configurations.
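The linear-algebraic form of the conjecture can be sanity-checked by brute force on toy instances: enumerate row subsets and, for each, collect the columns that are constant on it. A minimal sketch (exponential in the number of rows, so tiny matrices only); the rank-2 example matrix is invented:

    from itertools import combinations

    def best_constant_column_fraction(M):
        """Max over row subsets R (|R| >= 2) of |R|*|C| / (rows*cols),
        where C is the set of columns constant on R; brute force."""
        rows, cols = len(M), len(M[0])
        best = 0.0
        for r in range(2, rows + 1):
            for R in combinations(range(rows), r):
                C = [j for j in range(cols)
                     if len({M[i][j] for i in R}) == 1]
                best = max(best, r * len(C) / (rows * cols))
        return best

    # toy matrix of rank <= 2 over GF(2): every row is a combination of u and v,
    # so each column takes at most two distinct values (0/1)
    u = [0, 1, 1, 0, 1, 0, 1, 1]
    v = [1, 1, 0, 0, 1, 1, 0, 1]
    coeffs = [(1, 0), (0, 1), (1, 1), (1, 0), (0, 1), (1, 1)]
    M = [[(a * x + b * y) % 2 for x, y in zip(u, v)] for a, b in coeffs]
    print(f"largest constant-column fraction: {best_constant_column_fraction(M):.3f}")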
  4. Given a graph G = (V, E) and a subset T ⊆ V of terminals, a Steiner tree of G is a tree that spans T. In the vertex-weighted Steiner tree (VST) problem, each vertex is assigned a non-negative weight, and the goal is to compute a minimum weight Steiner tree of G. Vertex-weighted problems have applications in network design and routing, where there are different costs for installing or maintaining facilities at different vertices. We study a natural generalization of the VST problem motivated by multi-level graph construction, the vertex-weighted grade-of-service Steiner tree problem (V-GSST), which can be stated as follows: given a graph G and terminals T, where each terminal v ∈ T requires a facility of a minimum grade of service R(v) ∈ {1, 2, …, ℓ}, compute a Steiner tree G′ by installing facilities on a subset of vertices, such that any two vertices requiring a certain grade of service are connected by a path in G′ with the minimum grade of service or better. Facilities of higher grade are more costly than facilities of lower grade. Multi-level variants such as this one can be useful in network design problems where vertices may require facilities of varying priority. While similar problems have been studied in the edge-weighted case, they have not been studied as well in the more general vertex-weighted case. We first describe a simple heuristic for the V-GSST problem whose approximation ratio depends on ℓ, the number of grades of service. We then generalize the greedy algorithm of [Klein & Ravi, 1995] to show that the V-GSST problem admits a (2 ln |T|)-approximation, where T is the set of terminals requiring some facility. This result is surprising, as it shows that the (seemingly harder) multi-grade problem can be approximated as well as the VST problem, and that the approximation ratio does not depend on the number of grades of service. Finally, we show that this problem is a special case of the directed Steiner tree problem and provide an integer linear programming (ILP) formulation for the V-GSST problem.
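To make the problem shape concrete, here is one natural grade-by-grade heuristic of the ℓ-dependent kind the abstract alludes to: solve a vertex-weighted Steiner tree over the terminals needing grade ≥ g for each g, and give every chosen vertex the highest grade it serves. Both the layering and the naive nearest-terminal VST subroutine below are illustrative stand-ins, not the paper's algorithms:

    import heapq
    from collections import defaultdict

    def grow_to_nearest(adj, w, tree, targets):
        """Multi-source Dijkstra from the current tree under vertex weights
        (tree vertices are already paid for); returns the path of new
        vertices that attaches the nearest target to the tree."""
        dist = {u: 0 for u in tree}
        parent = {u: None for u in tree}
        pq = [(0, u) for u in tree]
        heapq.heapify(pq)
        while pq:
            d, u = heapq.heappop(pq)
            if d > dist[u]:
                continue
            if u in targets:
                path = []
                while u not in tree:
                    path.append(u)
                    u = parent[u]
                return path
            for v in adj[u]:
                nd = d + w[v]
                if nd < dist.get(v, float("inf")):
                    dist[v], parent[v] = nd, u
                    heapq.heappush(pq, (nd, v))
        raise ValueError("target not reachable")

    def naive_vst(adj, w, terminals):
        """Toy stand-in for a real VST approximation: Prim-style, repeatedly
        attach the nearest unreached terminal."""
        terminals = set(terminals)
        tree = {next(iter(terminals))}
        while not terminals <= tree:
            tree |= set(grow_to_nearest(adj, w, tree, terminals - tree))
        return tree

    def layered_gsst(adj, w, required, levels):
        """Grade-by-grade heuristic: solve a VST per grade g (highest first),
        then keep the maximum grade installed at each vertex."""
        grade = defaultdict(int)
        for g in range(levels, 0, -1):
            T_g = {v for v, r in required.items() if r >= g}
            if not T_g:
                continue
            chosen = naive_vst(adj, w, T_g) if len(T_g) > 1 else T_g
            for v in chosen:
                grade[v] = max(grade[v], g)
        return dict(grade)

    # tiny example: a path a-b-c-d-e with hypothetical weights and grades
    adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
           "d": ["c", "e"], "e": ["d"]}
    w = {"a": 1, "b": 4, "c": 2, "d": 4, "e": 1}
    print(layered_gsst(adj, w, {"a": 2, "c": 1, "e": 2}, levels=2))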
  5. We describe a paradigm for designing parallel algorithms via approximation, and illustrate it on the b-edge cover problem. A b-edge cover of minimum weight in a graph is a subset C of its edges such that at least a specified number b(v) of edges in C is incident on each vertex v, and the sum of the edge weights in C is minimum. The Greedy algorithm and a variant, the LSE algorithm, provide 3/2-approximation guarantees in the worst case for this problem, but these algorithms have limited parallelism. Hence we design two new 2-approximation algorithms with greater concurrency. The MCE algorithm reduces the computation of a b-edge cover to that of finding a b′-matching, by exploiting the relationship between these subgraphs in an approximation context. The LSE-NW algorithm is derived from the LSE algorithm by using static edge weights rather than dynamically computing effective edge weights. This relaxation gives LSE-NW a worse approximation guarantee but makes it more amenable to parallelization. We prove that both the MCE and LSE-NW algorithms compute the same b-edge cover, with at most twice the weight of the minimum weight edge cover. In practice, the 2-approximation and 3/2-approximation algorithms compute edge covers of weight within 10% of the optimal. We implement three of the approximation algorithms, MCE, LSE, and LSE-NW, on shared-memory multi-core machines, including an Intel Xeon and an IBM Power8 machine with 8 TB memory. The MCE algorithm is the fastest of these by an order of magnitude or more. It computes an edge cover in a graph with billions of edges in 20 seconds using two hundred threads on the IBM Power8. We also show that the parallel depth and work can be bounded for the Suitor and b-Suitor algorithms when edge weights are random.
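The effective-weight idea behind the Greedy/LSE family is compact enough to sketch: repeatedly pick the edge minimizing w(e) divided by the number of endpoints still short of their b(v) quota. A minimal illustrative sketch (one linear scan per step; the paper's algorithms are more refined and parallel); the example graph and demands are invented:

    def greedy_b_edge_cover(edges, b):
        """Pick edges by smallest effective weight w / (# deficient endpoints)."""
        deficit = dict(b)                 # remaining demand per vertex
        remaining = set(range(len(edges)))
        cover = []
        while any(d > 0 for d in deficit.values()):
            best, best_eff = None, float("inf")
            for i in remaining:
                u, v, w = edges[i]
                helps = (deficit[u] > 0) + (deficit[v] > 0)
                if helps and w / helps < best_eff:
                    best, best_eff = i, w / helps
            if best is None:
                raise ValueError("demands not satisfiable")
            u, v, w = edges[best]
            remaining.discard(best)
            cover.append((u, v, w))
            deficit[u] = max(0, deficit[u] - 1)
            deficit[v] = max(0, deficit[v] - 1)
        return cover

    # hypothetical weighted graph and per-vertex demands b(v)
    edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 2.5), ("c", "d", 1.5)]
    b = {"a": 1, "b": 1, "c": 2, "d": 1}
    cover = greedy_b_edge_cover(edges, b)
    print(cover, "total weight:", sum(w for _, _, w in cover))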