Wed, January 11, 2023
4:00pm Rakhi Singh: The State University of New York at Binghamton, New York, USA
Seminar on Data Science

Date and time: Wednesday, January 11, 2023, 4 pm
Venue: Ramanujan Hall
Host: Ashish Das

Speaker: Rakhi Singh
Affiliation: The State University of New York at Binghamton, New York.

Title: Subdata selection: Introduction and Recent Works

Abstract: Data reduction or summarization methods for large datasets (full data) aim to make inferences by replacing the full data with reduced or summarized data. Data storage and computational costs are among the primary motivations for this. In this presentation, data reduction will mean the selection of a subset (subdata) of the observations in the full data. While data reduction has been around for decades, its impact continues to grow, with approximately 2.5 exabytes (2.5 x 10^18 bytes) of data collected per day. We will begin by discussing an information-based method for subdata selection under the assumption that a linear regression model is adequate. A strength of this method, which is inspired by ideas from optimal design of experiments, is that it is superior to competing methods in terms of statistical performance and computational cost when the model is correct. A weakness of the method, shared with other model-based methods, is that it can give poor results if the model is incorrect. We will therefore conclude with a discussion of a model-free method. The work discussed here is joint work with John Stufken at George Mason University, USA.