Technical Reports - Archive

AGL-2009-12 Debugging Support for TokenNet Computations (pdf 655kb)
Mitchell, Kevin

Agilent Research Laboratories

TokenNets have been proposed as a way of simplifying the exploitation of multicore machines in the streaming domain. Providing high-level debugging support for such an approach presents a number of challenges. We describe the motivation behind the design of a TokenNet debugger and details of its implementation.

AGL-2009-11 TokenNets: An Approach to Programming Highly Parallel Measurement Science and Signal Processing (pdf 1.8mb)
Barford, Lee; Mitchell, Kevin; Vandeplas, Tom

All industries that provide value-added through the use of software are facing a "multicore crisis." Clock rates of central processing units (CPUs) stopped rising at their former exponential rate in about 2004. Instead, processor manufacturers are increasing the number of parallel execution units or cores provided per CPU. The number of cores is now increasing exponentially. In order for product performance to grow with improvements in processors, it will be necessary to program in such a way that this ever-growing number of parallel processors is used effectively.

AGL-2009-1 The Agilent Protocol Encoder (APE) (pdf 1.8mb)
Mitchell, Kevin

The Agilent Protocol Encoder (APE) was designed as a test-bench to explore various aspects of protocol specification. In addition to designing a domain-specific language for specifying communication protocols, the project built an integrated development environment (IDE) to assist in the construction and debugging of such specifications. The concept of action-directed decoding was developed to allow a range of different decoders to be constructed from the same specification. The compiler can generate code for the APE virtual machine (APE VM), with an instruction set tailored to the task of protocol decoding. A firmware implementation of this VM has also been developed, along with a more traditional back-end that generates C++ code. This paper summarizes the main contributions of the APE project. A companion report describes the design of the APE virtual machine and its implementation in firmware.

AGL-2005-7 NRSS: A Protocol for Syndicating Numeric Data (pdf 197kb)
Liu, Jerry; Purdy, Glen; Warrior, Jay; Engel, Glenn

This paper proposes NRSS (Numeric Really Simple Syndication), a protocol for syndicating numeric data over the web. It builds upon RSS (Really Simple Syndication) version 2.0, a popular protocol used to syndicate headlines for news stories around the web, and the data model from IEEE 1451.1, a standard for representing and describing numeric measurement data. This note provides an overview of NRSS and outlines some possible usage scenarios. It also describes how NRSS extends regular RSS and illustrates, via some examples, how an NRSS numeric summary feed can be constructed.

AGL-2002-4 Overabundance Analysis and Class Discovery in Gene Expression Data (pdf 343kb)
Ben-Dor, Amir; Friedman, Nir; Yakhini, Zohar

Recent studies (Alizadeh et al. 2000, Bittner et al. 2000, Golub et al. 1999) demonstrate the discovery of disease subtypes from gene expression data. In this paper, we propose a principled and systematic approach to address the computational problem of partitioning the set of sample tissues into statistically meaningful classes. We start by describing a method, called overabundance analysis, for assessing how informative a given expression dataset is with respect to a partition of the samples. As we show, in several published expression datasets, an overabundance of genes separating known classes is observed. Then, we use this method as the foundation for a novel approach to class discovery. In this approach, we search for partitions that have a statistically significant overabundance score. We evaluate the performance of our approach on synthetic data, where we show it can recover planted partitions. Finally, we apply it to several published tumor expression datasets, and show that we find several highly pronounced partitions.

AGL-2000-13 Scoring Genes for Relevance (pdf 197kb)
Ben-Dor, Amir; Friedman, Nir; Yakhini, Zohar

Recent molecular-level studies that compare different classes of disease conditions produce labeled gene expression data. We examine scoring methods that are useful in mining such gene expression data for genes that have biological relevance to the condition studied. Relevance information is useful in identifying genes driving the biological process, in selecting small subsets of genes with diagnostic potential, and in better understanding the condition studied and its relationship to known or hypothesized biochemical pathways. We present the scoring methods, describe a process for computing the corresponding p-values, and finally present results from application to actual cancer gene expression data. These include applying classification techniques to varying sets of genes selected by relevance.