RESEARCH OVERVIEW
The Department of Computer and Information Science at Polytechnic Institute of NYU is active in research in a number of key areas of Computer Science. Our research has been funded by grants from government agencies such as the National Science Foundation, NASA, the Office of Naval Research, the Air Force, and the New York State Office of Science, Technology, and Academic Research, and private companies and foundations such as IBM, Hewlett-Packard, AT&T, the Sloan Foundation, Panasonic, Intel, and Verizon.
Following is a brief overview over the main research areas represented in the department, together with pointers to additional information. Most of the research efforts in the department can be grouped into the following four broad areas. Follow the links for a detailed description of work in each area:
- Algorithms and Complexity, including computational geometry, combinatorial optimization, parallel algorithms, learning theory, applied algorithmics
- Visual Information Processing, including computer vision, image processing, image compression, graphics, visualization, multimedia
- Software Engineering, including software testing, verification, security
- Databases and Information Retrieval, including database systems, client server databases, data mining, text categorization, web search & analysis
- Other Areas
In addition, active research efforts are also underway in several other areas, including parallel & distibuted computing, wireless & mobile computing, and computer architecture; see here for some information. Additional computer-related research, in particular in computer networking, image processing, and computer architecture, is done in the Electrical and Computer Engineering department, and in the Center for Advanced Technology in Telecommunications (CATT), a large state-sponsored research center shared by the two departments.
Algorithms and Complexity
(Profs. Aronov, Brönnimann, Chiang, Hellerstein, Suel, van Slyke, Wein)
The department has significant research strength in the algorithms area, with about half of our current faculty being actively involved in work ranging from fundamental theoretical problems in complexity to the experimental evaluation of algorithmic techniques.
Computational Geometry: Several faculty members are working on problems in the area of Computational Geometry, which is concerned with fundamental techniques for computing with geometric objects, and forms the theoretical basis for much work in applied areas such as Computer Graphics and Vizualization. In particular, Prof. Aronov has been studying a variety of questions in Combinatorial Geometry, i.e., the study of the combinatorial properties and resulting fundamental complexities of geometric problems. Prof. Brönnimann's work is concerned with theoretical issues in Computational Geometry as well as with practical problems arising in the context of programming libraries and object-oriented techniques for geometric computing. Prof. Chiang is studying geometric problems in graphics and visualization, with emphasis on I/O-efficient techniques for processing massive geometric data sets.
Approximation Algorithms & Combinatorial Optimization: Research in this area is concerned with precise and approximate solutions to hard algorithmic problems arising in many real-world situations, such as the optimal design of computer networks and other infrastructure, the scheduling of tasks to ensure the efficient use of limited resources, or the partitioning of data and work between nodes of a parallel machine. Two faculty members in our department, Profs. van Slyke and Wein, primarily focus on problems in this area. Prof. van Slyke has extensively worked on problems in combinatorial optimization theory and the design of information networks; among his recent works is a new distributed version of the simplex method, and a non-cooperative game model of flow control in computer networks. Prof. Wein has focused on scheduling and resource allocation problems, including general conversion techniques for transforming off-line into on-line scheduling algorithms, task scheduling in generalized communication networks, combinatorial problems in scheduling independent jobs in parallel machines, efficient algorithms for the dense assignment problem, and fractiling in parallel computation. Several other faculty members have also worked on approximation algorithms for a variety of flow, partitioning, and geometric problems.
Computational Learning Theory: Prof. Hellerstein is working on fundamental problems in the area of Computational Learning Theory, an area concerned with algorithms and fundamental limits for machine learning. Much of Prof. Hellerstein's work concerns query models of learning, in which the learner can ask questions of a teacher or can perform experiments. She has developed a number of efficient algorithms for learning particular classes of concepts in query models. She has also developed new techniques for determining when a class of concepts is learnable in query models, and for designing algorithms that remains efficient in the presence of irrelevant information. Prof. Hellerstein is particularly interested in the learnability of classes of Boolean functions and has done related research on characterizations of Boolean function classes. Recently, Prof. Hellerstein has also begun working on learning-based approaches in information retrieval.
Parallel and Distributed Algorithms: This area is concerned with the design of efficient algorithms for massively parallel machines, networks of workstations, and wide-area distributed networks. Two faculty members, Profs. Suel and Wein, have worked extensively in these areas. Prof. Suel's research has been concerned with the theoretical foundations of fixed-connection networks, such as meshes, hypercubes, or other types of graphs, and he has shown fundamental upper and lower bounds for sorting and communication problems on these networks. Another area of interest to him are models of parallel computation that allow and encourage the design of programs that are efficiently portable among different parallel architectures. Prof. Wein's research in this area is concerned with load balancing and partitioning techniques that schedule computations in a distributed environment in ways that perserve the locality properties of the underlying problem and adapt to changes in resource availability. Both faculty are also working with colleagues to apply their algorithmic results to real-life parallel computing problems, including simulations of particle and semiconductor physics, signal propagation, and nuclear stockpile testing.
Visual Information Processing
(Profs. Chiang, Gluckman, Memon, Wong)
Another major research area in the department is concerned with the analysis, storage, transmission, and synthesis of images, video, and multimedia data. A group of faculty is working on a variety of problems in this domain, including Computer vision and image analysis, image and video compression and transmission, watermarking & protection mechanisms, computer visualization, and computer graphics. Work is supported by multiple grants, and also involves cooperation with several faculty in the Electrical Engineering department.
Image/Video Analysis & Pattern Recognition: Prof. Wong has worked extensively on the analysis of image and video data. A lot of his current research focuses on the analysis of document images, such as maps or engineering drawlings, and on information retrieval from images and video. For example, his work has resultied in a number of novel techniques and algorithms for noise filtering, segmentation, and lossy compression of engineering drawings. The segmentation process separates an engineering drawing into separate layers, which can then be compressed, retrieved, and analyzed separately for best overall performance. Prof. Wong's lossy compression algorithm, based on straight-line-extraction and higher-order-context-modeling techniques, allow engineering drawings to be compressed down to about 1% of its original size without loss of visual quality or useful information.
Prof. Wong has also developed novel techniques to retrieve images from image/video databases. With the rapid advances in digital technology, more and more databases are multimedia in nature, containing images and video in addition to textual information. Currently, most video databases are manually indexed based on textual annotations in an often tedious and time consuming process. Using a computerized approach, indexing and retrieval are performed based on features extracted directly from the video. One novel feature he developed was the Augmented Histogram where spatial information is added to a color histogram of an image, improving retrieval precision and recall over conventional histograms.
Computer Vision: Advances in both image sensors and the processing of image data will lead to devices that can intelligently gather information about the world and create digital models that mimic the real world in their complexity. These digital models will be used for scientific visualization, education, robotic exploration, and historical preservation.
The goal of Prof. Gluckman's research is to develop new imaging technologies and computational techniques for accurately sensing important scene properties such as shape, motion and reflectance. In particular, we are interested in designing stereo vision sensors and camera motion sensors. The stereo sensors he is developing have led to real-time systems for capturing three dimensional information. By altering the distribution of viewing rays, new cameras are being designed that simplify the estimation of camera motion. When coupled with new processing techniques, these devices will be able to rapidly and accurately measure the motion and shape of a moving object.
Multimedia Compression: Another active area of research is concerned with compression techniques for various forms of image and multimedia data. One faculty member, Prof. Memon, was involved in an international collaboration for designing a novel lossless compression scheme in response to the JPEG (Joint Pictures Expert Group) committee of the International Standards Organization's (ISO) call for proposals for a new standard. His scheme, called CALLIC ranked first in the evaluations conducted by ISO, and significantly influenced the adopted standard. Other work in this area includes compression techniques for volumetric data sets, by Profs. Chiang and Memon, document compression research by Prof. Wong (see above), and work on compressing HTML data and web graph structure by Profs. Memon and Suel.
Multimedia Content Protection: Research in this area focuses on techniques for preventing or detecting unauthorized use, copying, or alteration of multimedia content, such as images, audio, or video. One basic approach is based on Digital Watermarking, where a usually imperceptible signal is added that identifies the source, establishes ownership, or detects alteration of the data. In his work, Prof. Memon has demonstrated several serious weaknesses in current watermarking techniques, and has developed new techniques that overcome these shortcomings. His work on secure distribution of multi-media content is performed in collaboration with researchers in Intel, Hewlett-Packard and Panasonic. Other related work in the department concerns watermarking of documents (Prof. Wong) and of software (Profs. Memon and Naumovich).
Computer Graphics & Visualization: Work in this area is concerned with the synthesis of realistic images, and the visualization of highly complex data sets such as those obtained from scientific simulations and measurements and from medical applications. Recent advances in three-dimensional acquisition, simulation, modeling and virtual-reality techniques have led to massive datasets that exceed the main memory size and the interactive rendering capability of current graphics hardware. As the complexity of graphics datasets increases, I/O-efficiency becomes more and more important, but very few of the existing techniques explicitly consider this issue. Isosurface extraction is one of the most effective techniques for visualizing and studying volumetric data sets where objects are given by 3D sample points over their volume. This techniques allows us to visualize the interior of the objects and study them in detail; however, the data sets involved tend to be huge, making I/O-efficient techniques extremely important. Prof. Chiang works on developing isosurface techniques based on I/O-optimal indexing structures and advanced partitioning methods that achieve a speed-up of one to two orders of magnitude in query time for large data sets, using disk space of only 1.1-1.5 times the original data size.
Prof. Chiang also developed the first conflict prediction code based on geometric hashing to support free flight in Air Traffic Control, which is now being used by Seagull Technology in a NASA project. In addition, he devised the first out-of-core techniques that can efficiently perform both progressive-mesh simplification and view-dependent rendering for polygonal models larger than main memory, and received the Best Paper Award in Eurographics 2000.
Software Engineering
(Profs. Frankl, Memon, Naumovich)
Software engineering research in the department includes work on software testing techniques, finite state verification techniques, and network and application security.
Software Testing: Prof. Frankl has worked extensively on problems in testing large software systems, including theoretical aspects of software testing, comparing effectiveness of software testing techniques, and building testing tools. She played an important part in the development of data flow testing. In a program, a variable is often defined and used in several different statements -- data flow testing techniques decide to include a test case in a test suite if this test case executes certain combinations of statements for a given variable. Prof. Frankl has also developed a testing technique for object-oriented programs and worked in the area of regression testing -- techniques for efficiently testing a new version of a program when a previous version has already been tested.
Prof. Frankl's and her students' current work includes the development of techniques for testing database applications, and the experimental comparison of the effectiveness of several promising coverage-based testing techniques. Coverage-based techniques measure the quality of a test set by how fully this set ``exercises'' a specific aspect of a program. For example, for a test set to achieve a 100% statement coverage, each statement in the program has to be executed by at least one test case from this set. The research is evaluating how good such techniques are at detecting bugs. One of the challenges of testing database applications is that the test cases are represented not only by the conventional parameters passed to the application, but also by (potentially very large) tables in the database.
Finite State Verification: This research area addresses the problem of quality assurance of large distributed software systems. Finite state verification (FSV) is a family of techniques that use a finite model to represent a software system. The finite models are then analyzed for specification errors. FSV techniques take all possible inputs of a system into account and are largely automated, thus providing an important alternative to both testing and formal verification techniques. Prof. Naumovich has participated in the development of a FSV technique that can be applied to distributed Ada and Java programs.
Currently, the main interests of Prof. Naumovich in this area lie in improving performance of FSV techniques by optimizing algorithms that they use. Since several studies have demonstrated that even simple optimizations can dramatically improve performance of FSV tools, he hopes that a systematic analysis of the nature of different FSV techniques and previously proposed optimizations will lead to powerful new optimizations, making FSV applicable to much larger distributed systems than at present. A substantial part of the work in this direction will involve extensive experimental work on analyzing realistic distributed programs with FSV tools.
Another promising research direction is the collaboration of Profs. Frankl and Naumovich on combining testing and FSV in a synergistic manner. This approach attempts to use information from FSV analyses to investigate the reported problems with testing techniques and also to use information from testing in the form of heuristics to improve modeling used in FSV. Presently, a tool to be used for empirical evaluation of this approach is being constructed.
Analysis of Network and Application Security: Profs. Frankl, Memon, and Naumovich are interested in various aspects of network and application security. Currently, their interests are concentrated in the area of improving confidence in Java programs that may be attacked by hostile applets. Hostile applets (or, more generally, classes) can exploit weaknesses in Java systems by obtaining, in one way, or another, permissions to perform system-critical operations, such as accessing data on local disks. The main direction of the current work is to formulate important requirements on Java programs and develop techniques for validating these requirements. Such techniques will be based on both finite state verification approaches and testing approaches.
Virus protection is an important aspect of network security. Prof. Frankl has worked on detection of macro viruses using a range of static analysis techniques. Results of this work have been used in a virus detection tool by IBM.
Software Watermarking: Software piracy is extremely widespread (estimated at USD 15 Billion in 1999 alone). It is important that the legitimate developers of a software system be able to prove their authorship of the software. Software watermarking is a technique for embedding a message identifying the author in a software system. Profs. Memon and Naumovich are interested in the problem of watermarking Java programs. Currently, they are working on a dynamic watermarking technique, based on embedding a message in the run-time state of a program. Profs. Memon and Naumovich are also interested in several other techniques for preventing software piracy, including authentication and program obfuscation techniques.
Databases And Information Retrieval
(Profs. Delis, Hellerstein, Memon, Suel)
The fourth major research concentration in the department is concerned with the management, querying and analysis of large data sets, and includes the areas of database systems, data mining, information retrieval, and web search and exploration. Work is performed in several labs and research groups, with emphasis on algorithmic and architectural issues.
Client-Server Databases: Prof. Delis and his students in the Database Systems Lab are working on architectural and performance issues for client-server databases. Most modern databases are organized following variants of the Client-Server model, where a number of clients (e.g., PCs) interact with one or more servers that use database engines to retrieve data and serve it to the clients. Prof. Delis' work, supported by an NSF Career Award, focuses on performance issues in such architectures, where a naive implementation quickly leads to a performance bottleneck at the server. He has studied the scalability of the standard two-tier Client-Server model, and has proposed a three-tier model that employs a number of optimization techniques, including caching, prefetching, and client clustering, to scale to larger numbers of clients.
Query Processing and Optimization: Database systems have to be able to efficiently process highly complicated queries on large amounts of data. To achieve this, systems use a variety of techniques, such as highly optimized index structures for accessing the data, or query optimization for finding the best way to execute a query. Several faculty members are working on new techniques in this area, including index structures, cost estimation techniques, approximate query answers, and efficient operations in spatial databases. In particular, Prof. Delis has studied the performance characteristics of common index structures for disk- and memory-resident data, and the efficient implementation of temporal query operations. Prof. Hellerstein is working on new techniques for selectivity and cost estimation of database queries, and has worked on efficient coding schemes for parallel disk architectures (RAID) and methods for generating random range queries. Prof. Suel is working on problems in selectivity estimation, data partitioning, sampling and approximation techniques for query results, and query processing in spatial databases.
Intelligent Information Retrieval and Text Mining: Prof. Hellerstein is working on problems in intelligent information retrieval, such as learning to automatically categorize documents by topic, and learning to extract information from documents. Her work in this area, supported by a grant from the National Science Foundation, focuses on learning-based approaches based on fundamental results from Computational Learning Theory. Prof. Hellerstein is also leading a reading group of faculty and students focusing on current developments in information retrieval and machine learning.
Web Search and Analysis: One of the most fundamental problems facing the World Wide Web is how to efficiently find the desired information among the more than one billion currently accessible web pages. A large amount of industrial and academic work over the last few years has focused on this problem, and powerful search engines (such as AltaVista and Google) have been built using massive amounts of hardware. However, the basic search problem is far from resolved, and new challenges arise constantly as the web evolves.
Prof. Suel and his students are working on techniques for improving the efficiency of web search. Besides improving the quality of the search results, it is also important to improve computing and storage efficiency in order to keep up with the growth of the web and allow a deployment on more modest hardware. Closely related problems of interest are those of exploring or analyzing the structure and properties of the web and of efficiently storing and archiving the content of the web. Search engines need to be able to store massive amounts of encountered pages, and many techniques used by current search engines to rank results are based on analyzing and exploiting the hyperlink structure of the web.
Prof. Suel's research in this area, performed in the recently opened Web Exploration and Search Technology Lab (WestLab), looks at a number of problems in this context, ranging from system building to formal algorithm design and analysis, and includes the storage and compression of large web page collections, efficient data acquisition (crawling), analysis of the web graph and structure, and support for powerful query operations on web archives. Parts of this work are also performed in collaboration with Profs. Delis, Hellerstein, and Memon.
Other Areas
Computer Architecture: In the area of Computer Architecture, current research by Prof. Haldun Hadimioglu focuses on the design of packet processors for high speed networks and the employment of dynamic program and data compression schemes to reduce the effect of the processor-memory speed gap.
Wireless and Mobile Computing: The Wireless Information Services Research Group, involving several faculty from CIS and EE, studies new paradigms for transmitting information to and from mobile and hand-held devices. The current focus of research are infostations, which provide high bandwidth wireless connectivity in certain limited areas of coverage. Related work in the Visual Information Processing and Web Exploration and Search Technology Labs, supported by Intel Corporation, focuses on transmission and navigation of Web and Multimedia content on handheld devices.
|