This file (which should be named cdn2.zip) contains the data and source code necessary for the paper's replication.
The data used for this paper come from the first Panel Study of Entrepreneurial Dynamics (or PSED I). A PSED home page at Clemson University describes their collection and structure. That site distributes an SPSS-formatted data set, which we converted to Stata format for our work. Our paper uses questions from only the first wave. We used the questionnaire from that interview to guide us through the available information. We also found the questionnaire from the initial screening interview to be helpful.
As a precaution against the Clemson website someday being taken down, this replication file contains copies of both questionnaires in the /data directory. That directory also contains the data set in Stata format.
There currently exists a second PSED home page at the University of Michigan. This includes codebooks and data from the second PSED, which we do not use in this paper. However, we were unable to locate the PSED I data set there.
  listtex module installed. We use listtex to directly export Stata results into LaTeX tables and macros.
  natbib
  hyperref and hypernat
  setspace
  pgf/tikz
  pgfplots. \pgfmathprintnumber command for formatting numbers included in the text.
  rotating. \sidewaystable command.
  bbding. bbding dingbat font available. Used to place an envelope next to the corresponding author's name.
  todonotes
  versions

Stata does most of the calculations, and we use Matlab only to create the figures. (In retrospect, it would have been better to make these within LaTeX using the pgf and pgfplots packages.) The paper's text is written in LaTeX, which we process with pdflatex and bibtex. We bring the whole enterprise together with a set of makefiles.
Stata and Matlab are both commercial programs, available from their respective vendors. Any self-respecting research institution should have licenses for their use. The last three programs are all part of the standard Linux installation.
We include todonotes.sty in the replication file because it is not yet part of the standard TeXLive distribution of TeX and its friends. If you do not have some of the other required LaTeX packages, we suggest that you or your system administrator download and install the latest TeXLive distribution (for Unix and Mac) or MiKTeX distribution (for Windows). (We have tested the programs with the 2008 release of TeXLive.)
Since we use GNU make only to automate the paper's construction, the paper can also be reproduced by running the individual files in sequence by hand. On our Linux machine, the commands to start Stata and Matlab are stata-se and matlab. If these are different on your machine, then you will have to change each directory's makefile appropriately before using make.
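Before running make, it can help to confirm that the programs the makefiles call are on your PATH. The sketch below is only an illustration: check_tools is a hypothetical helper that is not part of the replication file, and stata-se and matlab are simply the command names used on our Linux machine.

```shell
#!/bin/sh
# check_tools is a hypothetical helper (not shipped in cdn2.zip).
# It reports which of the named programs can be found on PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Return non-zero if at least one program was not found.
    return "$missing"
}

# stata-se and matlab are the names on the authors' machine; edit if yours differ.
check_tools stata-se matlab pdflatex bibtex make ||
    echo "Install the missing programs or edit the makefiles before running make."
```

If Stata or Matlab is reported missing but is installed under another name, change the command name in each directory's makefile rather than in this sketch.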
These instructions assume that the relevant Unix machine's hard drive already has a copy of cdn2.zip. If this is not the case, Microsoft Windows users can install WinSCP and use it to transfer the file.
To begin, log into the machine with your favorite client and start a terminal/ssh session. Change your working directory to the directory containing the replication file, and then issue these commands.

  1. mkdir cdn2 (Creates an empty directory for the replication work.)
  2. mv cdn2.zip cdn2 (Moves cdn2.zip into the new directory.)
  3. cd cdn2 (Changes the working directory to the new directory.)
  4. unzip cdn2.zip (Extracts the files for replication.)
  5. make cdn2.pdf

If something goes wrong in one of the first four steps, then you probably do not have permission to write in the current directory. Contact your system administrator or a local Unix expert for help. An error in the fifth step probably indicates that the machine does not have the required software. If necessary, type Control-C to abort the replication. Then get help from a local expert.
Windows does not come with any flavor of make, so replicating the paper under Windows requires running each of the programs by hand. After unzipping cdn2.zip, run the following in order.

  1. psed.do in the data directory.
  2. .do files in the Tables and Figures directories. The order in which these are run is irrelevant.
  3. .m programs in the Figures directory. Again, these can be run in any order.
  4. pdflatex cdn2. This should generate lots of complaints about missing references and cross references.
  5. bibtex cdn2

Last modified on June 17, 2010

Auditing the paper's results
To ensure accuracy, we typed no quantitative result into the paper by hand. Instead, every number is generated by a Stata program and written to a text file. LaTeX reads these files and places the numbers in the appropriate places. If the number is spelled out in the text (as at the beginning of a sentence), then the text contains a margin note with its automatically generated counterpart. This scheme guarantees that there exists an audit trail for every result. To follow it, you can take one of two approaches.
Open cdn2.tex and find the code generating the number. Inline numbers are created from LaTeX macros (e.g. \amacro). The first part of the macro name indicates which Stata file generated it. (So \obsXX was generated by obs.do.)
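This part of the trail can also be followed from a terminal. The following is a minimal sketch, assuming a Unix shell with grep and find. trace_macro is a hypothetical helper that is not included in the replication file, and it assumes the generating program is named after the macro's prefix, as in the obs.do example.

```shell
#!/bin/sh
# trace_macro is a hypothetical helper (not shipped in cdn2.zip).
# Usage: trace_macro PREFIX [ROOT]
#   PREFIX  leading part of the macro name without the backslash, e.g. obs
#   ROOT    directory holding cdn2.tex (defaults to the current directory)
trace_macro() {
    prefix=$1
    root=${2:-.}
    # List every line of cdn2.tex that uses a macro starting with \PREFIX.
    printf '%s\n' "Uses in cdn2.tex:"
    grep -n -F "\\$prefix" "$root/cdn2.tex"
    # Locate the Stata program assumed to generate those macros.
    printf '%s\n' "Generating program:"
    find "$root" -name "$prefix.do"
}

# Example, run from the directory created by unzip:
#   trace_macro obs
```

The function only locates the macro uses and the matching .do file; reading the Stata code itself remains a manual step.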
Manifest
annals-cover.jpg
cdn.bib
cdn2.tex
makefile
todonotes.sty    todonotes package.
/data/erc_q1.pdf
/data/erc_sc.pdf
/data/ercw14s.dta
/data/makefile    /data subdirectory.
/data/psed.do
/Figures/demofig_fe.do
/Figures/demofig_me.m
/Figures/demofig_ma.do
/Figures/demofig_ma.m
/Figures/hcfig_fe.do
/Figures/hcfig_fe.m
/Figures/hcfig_ma.do
/Figures/hcfig_ma.m
/Figures/makefile    /Figures subdirectory.
/Figures/moneyfig_fe.do
/Figures/moneyfig_fe.m
/Figures/moneyfig_ma.do
/Figures/moneyfig_ma.m
/Figures/parentfig_fe.do
/Figures/parentfig_fe.m
/Figures/parentfig_ma.do
/Figures/paentfig_ma.m
/Tables/allhours.do
/Tables/allinvest.do
/Tables/anticipatedSize.do
/Tables/conception.do
/Tables/externalFundsPartner.do
/Tables/externalFundsSolo.do
/Tables/female.do
/Tables/fundsNeeded.do
/Tables/incomeresponse.do
/Tables/industry.do
/Tables/LegalForm.do
/Tables/makefile.do    /Tables subdirectory.
/Tables/obs.do
/Tables/partners.do
/Tables/representative.do
/Tables/respondenthours.do
/Tables/sponsorship.do
/Tables/stage.do
/Tables/stageTwo.do
/Tables/TimeUse.do
/Tables/wealth.do