Anteneh Tesfaye, M.S.

Software Engineer and Data Specialist

Ask A+I’s Software Engineer and Data Specialist, Anteneh Tesfaye, what he does, and the word you’ll most likely hear is “concatenate.” It means to link, draw parallels, and put the pieces together. Through his long association with A+I, Anteneh has also become adept at R programming and the application of computational methods in statistics.

“It’s hard to tell a story when the data is all over the place,” says Anteneh, who earned undergraduate and master’s degrees in Computer Science from Swarthmore College and Johns Hopkins University. “My job is to deliver clean, organized data sets that permit the clearest, simplest, and most reliable analysis.”

“Creating clean, documented data sets is critically important to the accuracy of our work, but being in the trenches is not for the faint of heart. Hundreds of Excel, CSV, or PDF files may be scattered across disparate locations. Sometimes we have to scrape the data from a website. And the raw data is almost always unclear. What do the numbers or categories represent? What units of observation have been measured? How were samples chosen? What’s the population? How, when, and where were the measurements made? Are there missing values, data errors, or missing documentation? How can they be detected and fixed?

“For each data set, I document what data is represented, where it came from, how it was obtained, and what calculations were used to arrive at it, so that others can reproduce our results. Then I attempt to replicate analyses prepared by others to verify that their data is accurate.

“The same data can tell fifty different stories. At first, you don’t know what you have. Once I start assembling and cleaning, a story starts to emerge. Will and Bill provide the questions the data needs to answer, and I lead the researchers collecting the data. It’s like a feedback loop. We all work in unison to ensure that our complex data set is on target.

“It’s fascinating work. No two projects are the same. Programming is such a dynamic field, and thinking like a data scientist adds a new dimension to my work.”