Expertise
- Sampling
- Statistical validity and significance
- Regression analysis
- Data mining, data analysis, and graphics
- Modeling and estimation
- Measurement reliability and validity
- Probability and risk analysis
- Forecasting and time series
- Quality control
- Research design
Sampling
Random sampling has revolutionized the methods of collecting information in virtually every field. Collecting data on an entire population, which used to be the only alternative, is now rarely necessary thanks to precise statistical sampling methods.
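As a rough illustration of why a sample can stand in for a full count, the sketch below estimates a population total from a simple random sample and attaches an approximate 95% margin of error. The function name `estimate_total`, the synthetic "claim amounts", and the sample size are assumptions made for this example only; they do not describe any of the projects listed here.

```python
import math
import random
import statistics

def estimate_total(sample, population_size, z=1.96):
    """Estimate a population total from a simple random sample
    drawn without replacement.

    Returns (estimate, margin), where margin is the half-width of an
    approximate 95% confidence interval (normal approximation).
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)                # sample standard deviation
    fpc = math.sqrt(1 - n / population_size)     # finite population correction
    se_mean = sd / math.sqrt(n) * fpc
    return population_size * mean, z * population_size * se_mean

# Synthetic population of 10,000 hypothetical claim amounts.
rng = random.Random(0)
population = [rng.lognormvariate(6, 1) for _ in range(10_000)]
sample = rng.sample(population, 400)

estimate, margin = estimate_total(sample, len(population))
print(f"estimate: {estimate:,.0f} +/- {margin:,.0f}")
print(f"true total: {sum(population):,.0f}")
```

Here 400 records are examined instead of 10,000, and the margin of error states how far the estimate is likely to be from the full count.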
Examples of where we have used statistical sampling:
- Class action against major auto insurer over claim handling
- Determining dollar amount of parking fraud at Philadelphia International Airport
- Environmental education survey study
- Insurance claims sampling to determine damages I
- Sampling for financial management for the U.S. Geological Survey
- Sampling determines overtime damages economically and with agreement
- Sampling medical insurance claims files for Blue Cross Blue Shield for an audit challenge
- Superman defended by statistician
Statistical validity and significance
Sound statistical methodology and implementation assure validity in the results of a study, but they do not guarantee statistical significance, which depends on the underlying variability in the data as well as the amount of data collected and method of collection. In our work we encounter analyses that lack statistical validity as well as analyses that, though statistically valid, lack results that are statistically significant.
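The dependence of significance on variability and on the amount of data can be sketched with a simple large-sample z statistic for a difference between two group means. The function `z_statistic` and the numbers below are hypothetical, and the equal group sizes and common standard deviation are simplifying assumptions.

```python
import math

def z_statistic(mean_diff, sd, n_per_group):
    """z statistic for a difference in means between two equal-sized groups
    that share a common standard deviation (large-sample approximation)."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    return mean_diff / standard_error

# Same observed difference (5 units), different variability and sample sizes.
for sd, n in [(10, 20), (10, 200), (40, 200)]:
    z = z_statistic(5, sd, n)
    verdict = "significant" if abs(z) > 1.96 else "not significant"
    print(f"sd={sd:>2}, n={n:>3}: z={z:.2f} ({verdict} at the 5% level)")
```

The same 5-unit difference is significant only in the middle case, where the data are both plentiful and not too variable. All three analyses are equally valid in method.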
Examples of our projects concerning validity and significance:
- Age discrimination and a city police department
- Cheating on a police examination for promotion
- Forecasting project, where we looked at models that forced a relationship between variables, even where none existed
- Insurance project, where conclusion was based on only 1 data point per year
- Parking Case, where a questionable out-of-sample projection was made from regression
- Statistically significant differences and discrimination in hospital closures
- Race discrimination and a city police department
Regression analysis
Regression analysis is a tool to determine how one or several factors are related to an outcome of interest. The outcome might be sales, accidents, market prices, or salary, and the factors that influence it are manifold.
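A minimal sketch of the simplest case, one factor and one outcome, fitted by ordinary least squares. The helper `fit_line` and the experience and salary figures are invented for illustration; real analyses typically involve several factors and diagnostic checks that are not shown.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: years of experience and salary in thousands of dollars.
years = [1, 3, 5, 7, 9]
salary = [42, 50, 55, 66, 72]

intercept, slope = fit_line(years, salary)
print(f"salary = {intercept:.1f} + {slope:.2f} * years")
```

The slope is the estimated change in the outcome per unit change in the factor, here about 3.8 thousand dollars per additional year in the made-up data.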
Examples of our projects concerning regression analysis:
- Construction condition survey of spalled jumbo bricks
- Commercial real estate valuation using massive dataset
- Sampling methodology in challenges to disallowances of federal aid to states
- Statistical portrait of Yonkers New York Public Schools in a desegregation suit
Data mining, data analysis, and graphics
Powerful statistical methods can be used to identify key transactions or other information, and enable estimation of rare events, yet the most incisive analysis is useless if it is impossible to explain. At Analysis and Inference, we are on the forefront not only of modern analysis techniques, but also of the use of graphical techniques to enhance our clients' understanding of our conclusions.
Examples of our use of data mining, analysis, and graphics:
- Cadillac dealer franchise location and geography of sales
- Pro bono assistance in case of government agency's finding of child abuse
- Statistical machine learning algorithm classifies customer accounts for revenue gain
- Statistical portrait of Yonkers New York Public Schools in a desegregation suit
- Tracking insurance costs for global construction company
Modeling and estimation
Quantitative modeling uses historical data to determine future behavior, infer relationships, and estimate profits, losses, or fraud and theft amounts.
Examples of our modeling projects:
- Accident prevention through statistical modeling
- Advertising strategy modeling
- Classification of insurance risk using "shrinkage" estimation
- Client response modeling in direct mail
- Discount rate in valuation of General Dynamics shipyard
- Quantifying theft from parking meters in New York City using research design and modeling
- Statistical modeling and analysis of Public Broadcast Corporation share of TV royalties
Measurement reliability and validity
Measurements may differ between measurers, or between successive measurements by the same measurer. Reliability refers to the extent to which such variation is minimized. Validity concerns whether the measure actually captures the quantity it is understood to represent, i.e., is it accurate? For any set of data, we want to know: how reliable and how valid are the data?
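The distinction can be sketched numerically when a reference value is known: the spread of repeated measurements reflects reliability, and the gap between their average and the reference reflects accuracy. The function `reliability_and_bias` and the scale readings are hypothetical, and treating validity as simple bias is a simplification adopted for this example.

```python
import statistics

def reliability_and_bias(measurements, reference_value):
    """Summarize repeated measurements of an item whose true value is known.

    spread: standard deviation across repeats (smaller = more reliable)
    bias:   mean measurement minus the reference (closer to 0 = more accurate)
    """
    spread = statistics.stdev(measurements)
    bias = statistics.fmean(measurements) - reference_value
    return spread, bias

# Hypothetical scales weighing a 100 g reference weight.
readings = {
    "consistent but off": [104.9, 105.0, 105.1, 105.0],
    "scattered but centered": [96.0, 103.0, 99.0, 102.0],
}
for label, values in readings.items():
    spread, bias = reliability_and_bias(values, 100.0)
    print(f"{label}: spread={spread:.2f}, bias={bias:+.2f}")
```

The first scale is reliable but not accurate; the second is accurate on average but unreliable.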
Examples of our projects concerning measurement reliability and validity:
- Evaluation of data biases and product marketing for a pharmaceutical company
- Performance metrics in telecom
- Timeliness of delivery of special education services
Probability and risk analysis
Probability models are important wherever there is variation from unit to unit in a population, or when there is inherent uncertainty in outcomes. Risk analysis includes the systematic application of probability models to product reliability, accidents, and catastrophic events.
Examples of our projects concerning probability and risk analysis:
- Brittle fracture probability for engineering application
- Nuclear power plant risk assessed using insurance premiums
- Risk analysis in rear axle safety in Saturn models
- Survival and reliability analysis for medical device implant
Forecasting and time series
Data collected over time on quantities such as sales, stock price, or product quality often require forecasting and time series analysis. These analyses frequently require special procedures because data points from nearby time periods are related to each other.
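One common way to measure how strongly neighboring time points are related is the lag-1 autocorrelation, sketched below. The function name and the two short series are made up for illustration.

```python
def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation: how strongly each value tracks
    the value immediately before it."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    return numer / denom

# Hypothetical monthly figures.
trending = [10, 11, 13, 14, 16, 17, 19, 20]
alternating = [10, 20, 10, 20, 10, 20, 10, 20]

print(f"trending:    {lag1_autocorrelation(trending):.3f}")
print(f"alternating: {lag1_autocorrelation(alternating):.3f}")
```

A value far from zero signals that the observations are not independent, so methods that assume independence can misstate the uncertainty of estimates and forecasts.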
Examples of our forecasting and time series projects:
- Accuracy of restatement of earnings
- Finding "changepoints" in stock prices
- Forecasting issue in federal audit of state social services department
- Predicting claim payouts for grocery chain using a statistical model
Quality control
Quality control has its roots in manufacturing, where statistical methods applied to the assembly line vastly improved both quality and production levels. These methods have been adopted in the service industry as well, where proper analysis of multi-step procedures can lead to major efficiency gains and result in better final products.
Examples of our quality control projects:
- Stipulation to use sampling to determine quality of city social service delivery
- Timeliness of delivery of special education services
Research design
Research design is at the heart of valid knowledge application. A well-designed analysis can save time and money and give precise answers. A poorly designed study may be costly but nonetheless result in imprecise and inaccurate conclusions.
Examples of our use of research design:
- Experimental design for performance of garbage trucks
- Quantifying theft from parking meters in New York City using research design and modeling I
- Quantifying theft from parking meters in New York City using research design and modeling II
Case: Sampling methodology in challenges to disallowances of federal aid to states
The U.S. Department of Health and Human Services (HHS) is responsible for oversight of federal aid given to states to support the major benefit programs of Temporary Assistance for Needy Families (formerly AFDC or welfare), Food Stamps, and Medicaid. As such, HHS administers the largest quality control sampling scheme anywhere, in cooperation with the states. Error rates found in state and federal samples of payments determined by state social workers are used to reduce, pro rata, the federal aid to states that exceed their target maximum error rates. These penalties have run to tens of millions of dollars.
In reviewing the program, Analysis & Inference identified important and previously unnoticed flaws in the statistical estimates of the penalties derived from the estimated error rates. The sampling methods themselves were well-designed by leading authorities in sample surveys, but their use in levying penalties was shown to be subject to important biases against states.
The HHS Departmental Appeals Board upheld state challenges to the penalties based in part on our statistical critique and testimony before the Board. The statisticians who worked on the project later published the statistical analysis, which applied modern Bayesian statistical methods, in the Journal of the American Statistical Association.
Case: Performance metrics in telecom
The question: how to test, for state regulatory commissions, whether a competitive environment exists in local telecommunications markets.
We created statistical sampling and data analysis plans to test and verify performance and incentive plans of telecommunications providers in a series of matters before state public service commissions in New York, Florida, Georgia, Michigan, Virginia, and Colorado. Dr. Salzberg was a co-inventor (Patent #6,636,585) of performance metrics used in this work. Executing the plans required processing and testing millions of yearly transactions. We implemented sampling plans and software programs to test the data integrity of vast "raw" datasets required to test and process these transactions and we reconstructed and replicated thousands of monthly statistical tests on the telecommunications data to ensure the tests were properly coded and implemented into the phone companies' systems.
Dr. Alan Salzberg testified before the state public service commissions on the results of the statistical analyses and testing.
Case: Determining dollar amount of parking fraud at Philadelphia International Airport
Analysis & Inference has estimated amounts of theft and fraud in large cash operations for parking installations, parking meters, and rapid transit in Philadelphia, Boston, New York, and San Diego. In the Philadelphia case, some parking collectors at the Philadelphia International Airport had for three or more years run a scam in which they took the ticket from the parker, substituted a counterfeit lower-value ticket, and rang up the lower-value one, keeping the difference for themselves.
We advised the insurer of the Airport against fraud and theft on the estimated amount taken by the collectors. Using several sources of airport operational data, we triangulated the estimates using statistical models and estimation. The different estimates were in reasonable agreement with those of a statistician for the Airport.
Several collectors had previously been charged and convicted. The parties negotiated the amount of the insurance claim.
Case: Superman defended by statistician
When the American Broadcasting Company (ABC) unveiled its television series, "The Greatest American Hero," in March of 1981, the copyright owners of "Superman" concluded that the show's main character, Ralph Hinkley, bore more than a coincidental resemblance to Superman. Time Warner Communications, a copyright owner, initiated suit for infringement. The case came to trial in the U.S. District Court for the Southern District of New York. As a central ingredient for their factual case, plaintiffs had commissioned a large New York firm to do a nationwide telephone survey designed to determine the degree of perceived similarity between Ralph Hinkley and Superman.
Analysis & Inference advised the attorneys for Time Warner Communications on the strengths and weaknesses of the survey commissioned by Defendants.
In the end the trial judge, on legal grounds, did not allow the survey in. Counsel for Time were nonetheless grateful for the necessary preparation. To the best of our knowledge, Analysis & Inference is the only statistics firm to have defended Superman.
Case: Accuracy of restatement of earnings
A shareholder's suit against a major service firm went to mediation. A key factual issue dealt with interpreting the statistical "time series" of the firm's stock prices, and regression analyses undertaken on behalf of the firm that found a break in the series was not traceable to failure to make proper disclosures.
Analysis & Inference identified a critical flaw in the application of statistical methods by the prominent economist who presented his findings at the mediation. The senior statistician at Analysis & Inference who did the work spoke at the mediation regarding the flaw.
Counsel for plaintiffs credited the statistician as an important contributor to the settlement that was reached.

