FLM2 Replication Files

This repository contains replication files for Farrell, Liang, and Misra (2026), arXiv:2010.14694, version 4.

A separate Python implementation is available at the link below. It has not been tested against this repository:

https://deep-inference.readthedocs.io/en/latest/index.html

Repository Overview

The repository includes replication code and example datasets used to illustrate methods related to post-machine-learning inference, nonlinear binscatter methods, empirical applications, and simulation evidence.

The current data dictionary covers two empirical datasets and one simulation study:

| Application | Main data or target | Source |
| --- | --- | --- |
| Bertrand et al. consumer credit marketing experiment | `adcontentworth_qjecsv.csv` | Bertrand, Karlan, Mullainathan, Shafir, and Zinman (2010), The Quarterly Journal of Economics |
| American Community Survey zip-code-level data | `CCFF_2024_ACS_2.csv` | Cattaneo, Crump, Farrell, and Feng, “Nonlinear Binscatter Methods” |
| Simulation study | `mu = E[beta(X)]` in a linear-in-treatment model | Monte Carlo design for Farrell, Liang, and Misra (2025) |

Data Dictionary and Code Guide

Dataset 1: Bertrand et al. Consumer Credit Marketing Experiment

File name: adcontentworth_qjecsv.csv

This dataset comes from Bertrand, Karlan, Mullainathan, Shafir, and Zinman (2010), “What’s Advertising Content Worth? Evidence from a Consumer Credit Marketing Field Experiment,” The Quarterly Journal of Economics, 125(1):263–306.

The data are from a large-scale field experiment run on behalf of a financial institution in South Africa. Consumers were sent marketing materials for short-term loans in which both the interest rate and features of the advertising content were randomized.

Code Files

| File | Description |
| --- | --- |
| `FLM2_Bertrand_step0_functions.R` | Defines the structured neural-network estimator for the Bertrand application and the `H` functions used to define semiparametric inference targets of the form `mu = E[H(...)]`. |
| `FLM2_Bertrand_step1_fittingDNNs.R` | Fits the first-stage structured neural networks for the Bertrand loan application data and saves cross-fitted DNN model objects for later inference. |
| `FLM2_Bertrand_step2_InferenceStep.R` | Loads the saved first-stage DNN fits, estimates the `Lambda(x)` objects needed for the influence-function correction, and computes semiparametric inference for targets such as marginal effects and optimal profits. |
| `FLM2_Bertrand_valueOfStructure.R` | Demonstrates the value of structural restrictions by comparing random forests, neural networks, and a structural binary-choice logit model for demand estimation and profit optimization. |
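
For orientation, the role of `H` and `Lambda(x)` in the inference step can be sketched in formulas. This is a summary of the general form of the influence-function correction, not a transcription of the code: the loss `\ell`, the parameter functions `\theta(x)`, the observation `w = (y, t, x)`, and the fixed evaluation point `t^*` are notation assumed here rather than defined in this README, and the scripts may differ in details such as cross-fitting and regularization.

```latex
\begin{aligned}
\mu &= \mathbb{E}\big[ H(X, \theta(X); t^*) \big], \\
\Lambda(x) &= \mathbb{E}\big[ \ell_{\theta\theta}(Y, T, \theta(x)) \mid X = x \big], \\
\psi(w) &= H(x, \theta(x); t^*)
          - H_{\theta}(x, \theta(x); t^*) \, \Lambda(x)^{-1} \, \ell_{\theta}(y, t, \theta(x)), \\
\hat{\mu} &= \frac{1}{n} \sum_{i=1}^{n} \hat{\psi}(w_i),
\qquad
\widehat{\mathrm{se}} = \sqrt{ \frac{1}{n^{2}} \sum_{i=1}^{n} \big( \hat{\psi}(w_i) - \hat{\mu} \big)^{2} }.
\end{aligned}
```

Here `\ell_\theta` and `\ell_{\theta\theta}` denote the gradient and Hessian of the loss with respect to the parameter functions, `H_\theta` is the gradient of `H`, and hats denote plug-in values from the first-stage fits.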

Variables

| Variable | Description |
| --- | --- |
| `offer4` | Randomly assigned monthly offer interest rate for the four-month loan, measured in percentage-point units, for example 8.2 for 8.2% per month. |
| `speak_trt` | Indicator that the mailer included the language-affinity message “We speak [client’s language]” for eligible clients whose primary language was not English. |
| `stripany` | Indicator that the mailer included a rate-description strip or banner saying either “A special rate for you” or “A low rate for you.” |
| `dphoto_none` | Indicator that the mailer did not include a person’s photograph. |
| `dphoto_black` | Indicator that the mailer included a photograph of a Black person, as opposed to another photo race category or no photo. |
| `dphoto_female` | Indicator that the mailer included a photograph of a woman, as opposed to a male photograph; no-photo cases are separately captured by `dphoto_none`. |
| `prize` | Indicator that the mailer mentioned the promotional cell-phone raffle. |
| `oneln_trt` | Indicator that the example-loan table showed one example loan rather than four example loans. |
| `use_any` | Indicator that the suggested-use line gave only the general message that the client could use the cash or loan for anything, rather than naming a specific use such as school, debt repayment, appliance purchase, or home repair. |
| `intshown` | Indicator that the example-loan table displayed the interest rate in addition to the monthly repayment information. |
| `comploss_n` | Indicator that the competitor-rate comparison was framed as a loss, for example “If you borrow elsewhere, you will pay … more,” rather than as a gain. |
| `comp_n` | Indicator that the mailer included any comparison to a competitor or outside rate, as opposed to no competitor-rate comparison. |
| `waved3` | Indicator for the later mailer/randomization wave, corresponding to the October mailing wave rather than the September wave. |
| `dormancy` | Number of months since the client’s most recent prior loan from the lender. |
| `trcount` | Number of previous loans the client had taken from the lender. |
| `female` | Indicator that the client is female. |
| `race` | Client race category, used both as a covariate and for photo-race matching; the paper reports African, Indian, White, and Mixed/“Colored” categories. |
| `nspeakeligible` | Indicator that the client was eligible for the language-affinity treatment because the client’s primary language was not English. |
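
Before running the Bertrand scripts, it can help to confirm that a loaded data frame contains the variables documented above. The following is a minimal sketch in base R; `check_columns` is a hypothetical helper written for this README, not a function from the replication code, and the toy data frame stands in for the real file.

```r
# Variable names as documented in the table above.
bertrand_vars <- c(
  "offer4", "speak_trt", "stripany", "dphoto_none", "dphoto_black",
  "dphoto_female", "prize", "oneln_trt", "use_any", "intshown",
  "comploss_n", "comp_n", "waved3", "dormancy", "trcount",
  "female", "race", "nspeakeligible"
)

# Hypothetical helper: report which expected columns are absent from df.
check_columns <- function(df, expected) {
  absent <- setdiff(expected, names(df))
  list(ok = length(absent) == 0, missing = absent)
}

# Toy stand-in for the real data (two rows of zeros, documented column names).
toy <- as.data.frame(
  matrix(0, nrow = 2, ncol = length(bertrand_vars),
         dimnames = list(NULL, bertrand_vars))
)

res_full <- check_columns(toy, bertrand_vars)
res_drop <- check_columns(toy[, setdiff(bertrand_vars, "offer4")], bertrand_vars)
```

With the real file, the same check would be `check_columns(read.csv("adcontentworth_qjecsv.csv"), bertrand_vars)`, assuming the working directory is the repository root. The file may contain additional columns not documented here.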

Dataset 2: American Community Survey

File name: CCFF_2024_ACS_2.csv

This dataset comes from Cattaneo, Crump, Farrell, and Feng, “Nonlinear Binscatter Methods,” arXiv:2407.15276.

The original data construction and replication repository are linked from the nppackages replication page:

https://nppackages.github.io/replication/

The data are obtained from the American Community Survey using five-year survey estimates beginning in 2013 and ending in 2017, available from the U.S. Census Bureau. The analyses are performed at the zip code tabulation area level for the United States, excluding Puerto Rico.

Code Files

| File | Description |
| --- | --- |
| `FLM2_ACS.R` | Replicates the CCFF binscatter-style figure for uninsured rates by income and population-density group, then applies the FLM2 estimation and inference procedure to the ACS application. |

Variables

| Variable | Description |
| --- | --- |
| `uninsuredRate` | Percent of people without health insurance. |
| `perCapitaIncome` | Per capita income. |
| `idxpopdens` | Indicator that divides states into low- and high-population-density groups, using 100 people per square mile as the cutoff; state population density is defined as average population per square mile using Census Bureau data. |
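
To show how the three variables fit together, the sketch below computes plain binned means of `uninsuredRate` over quantile bins of `perCapitaIncome`, separately by `idxpopdens`. It is a simplified illustration in base R, not the CCFF nonlinear binscatter estimator and not the code in the ACS script. The data are synthetic placeholders that reuse the documented column names, `binned_means` is a hypothetical helper, and the 0/1 coding of `idxpopdens` is an assumption.

```r
set.seed(2)
n <- 2000

# Synthetic stand-in for CCFF_2024_ACS_2.csv (values are NOT real ACS data).
acs <- data.frame(
  perCapitaIncome = exp(rnorm(n, mean = 10.2, sd = 0.4)),
  idxpopdens      = rbinom(n, 1, 0.5)   # assumed 0/1 coding of the group indicator
)
acs$uninsuredRate <- pmax(0, 25 - 2 * log(acs$perCapitaIncome) + rnorm(n, sd = 3))

# Hypothetical helper: mean income and mean uninsured rate within quantile bins.
binned_means <- function(df, n_bins = 10) {
  breaks <- quantile(df$perCapitaIncome, probs = seq(0, 1, length.out = n_bins + 1))
  bin <- cut(df$perCapitaIncome, breaks = breaks,
             include.lowest = TRUE, labels = FALSE)
  bin <- factor(bin, levels = seq_len(n_bins))
  data.frame(
    bin       = seq_len(n_bins),
    income    = as.numeric(tapply(df$perCapitaIncome, bin, mean)),
    uninsured = as.numeric(tapply(df$uninsuredRate, bin, mean))
  )
}

# One table of binned means per population-density group.
by_group <- lapply(split(acs, acs$idxpopdens), binned_means)
```

With the real data, `acs` would instead come from `read.csv("CCFF_2024_ACS_2.csv")`, assuming the working directory is the repository root.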

Simulation Study

The simulation study evaluates inference for `mu = E[beta(X)]` in a linear-in-treatment model of the form `Y = alpha(X) + beta(X)T + epsilon`, with one-dimensional covariates and treatment. The design compares full-sample and cross-fitted neural-network estimators, automatic-differentiation corrections, auto-DML, and generalized random forest benchmarks under configurable treatment assignment and heteroskedasticity.
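
As a rough illustration of the target and the correction, the sketch below simulates one dataset from a model of this form and averages the standard doubly robust (AIPW) score for `mu`. Everything specific in it is an assumption made for illustration and is not taken from the simulation scripts: the functions `alpha` and `beta`, the binary treatment with propensity `p`, and the homoskedastic normal errors. It also plugs in the true nuisance functions rather than neural-network fits, so it shows only the form of the estimator, not the methods being compared.

```r
set.seed(1)
n <- 5000

# Hypothetical nuisance functions (not the design used in the repository).
alpha <- function(x) sin(2 * pi * x)
beta  <- function(x) 1 + x^2          # with X ~ U(0,1), E[beta(X)] = 1 + 1/3
mu_true <- 1 + 1 / 3

x <- runif(n)
p <- plogis(x - 0.5)                  # assumed propensity P(T = 1 | X = x)
t <- rbinom(n, 1, p)
y <- alpha(x) + beta(x) * t + rnorm(n)

# Oracle residual from the linear-in-treatment model.
resid <- y - alpha(x) - beta(x) * t

# Influence-function values: plug-in beta(x) plus a correction term.
# For binary T, E[T | x] = E[T^2 | x] = p, so the weight is (t - p) / (p * (1 - p)).
psi <- beta(x) + (t - p) / (p * (1 - p)) * resid

mu_hat <- mean(psi)
se     <- sd(psi) / sqrt(n)
ci     <- mu_hat + c(-1, 1) * qnorm(0.975) * se   # nominal 95% interval
```

Repeating this over many simulated datasets and recording how often `ci` covers `mu_true` gives a coverage rate, which is the kind of summary a Monte Carlo study of inference typically reports.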

Code Files

| File | Description |
| --- | --- |
| `FLM2_simuls_1_functions.R` | Defines the simulation data-generating process, neural-network estimators, auto-DML routines, one-replication simulation function, and helper functions for summarizing results. |
| `FLM2_simuls_2_run.R` | Sets the simulation design parameters, runs the Monte Carlo replications in parallel, saves the resulting simulation objects, and writes LaTeX tables and diagnostic output. |

References

Bertrand, Marianne, Dean Karlan, Sendhil Mullainathan, Eldar Shafir, and Jonathan Zinman. 2010. “What’s Advertising Content Worth? Evidence from a Consumer Credit Marketing Field Experiment.” The Quarterly Journal of Economics 125(1):263–306.

Cattaneo, Matias D., Richard K. Crump, Max H. Farrell, and Yingjie Feng. “Nonlinear Binscatter Methods.” arXiv:2407.15276.

Farrell, Max H., Tengyuan Liang, and Sanjog Misra. 2025. arXiv:2010.14694, version 3.
