    Bad Data

    Sun, 02/13/2011 - 20:38 EDT - Baseline Scenario - The Blog
    • climate change
    • commentary
    • Comments
    • crime
    • Education
    • environment
    • law school

    By James Kwak
    To make a vast generalization, we live in a society where quantitative data are becoming more and more important. Some of this is because of the vast increase in the availability of data, which is itself largely due to computers. Some is because of the vast increase in the capacity to process data, which is also largely due to computers. Think about Hans Rosling’s TED Talks, or the rise of sabermetrics (the “Moneyball” phenomenon) not only in baseball but in many other sports, or the importance of standardized testing scores in K-12 education, or Karl Rove’s use of data mining to identify likely supporters, or the FiveThirtyEight revolution in electoral forecasting, or the quantification of the financial markets, or zillions of other examples. I believe one of my professors has written a book about this phenomenon.
    But this comes with a problem. The problem is that we do not currently collect and scrub good enough data to support this recent fascination with numbers, and on top of that our brains are not wired to understand data. And if you have a lot riding on bad data that is poorly understood, then people will distort the data or find other ways to game the system to their advantage.
    Readers of this blog will all be familiar with the phenomenon of rating subprime mortgage-backed securities and their structured offspring using data exclusively from a period of rising house prices — because those were the only data that were available. But the same issue crops up in many different stories covering different aspects of society.
    CompStat, an approach to policing that focuses on tracking detailed crime metrics, was widely credited with helping New York and other cities reduce crime in the 1990s. Last year, This American Life ran a story, based on a police officer’s secret recordings, detailing how in at least one precinct officers were pressured to boost their numbers through dubious arrests and citations. The reporters also found another precinct where serious crimes were recorded as less serious ones in order to make the precinct’s numbers look better than they really were.
    In a recent New York Times story, David Segal describes how law schools massage their metrics to score higher in the US News and World Report rankings. Segal focuses on the tricks that some schools seem to use to boost the number of graduates employed nine months after graduation; for example, some schools apparently hire their own graduates to temporary positions that happen to span the date on which employment rates are measured. The rankings are based on statistics that are defined by the American Bar Association but are self-reported by the schools and not audited by anyone.
    The big, well-known example of how the importance of data breeds data manipulation is standardized testing. In the early days of the standardized testing boom, the key statistic was the percentage of students at or above grade level, defined as the fiftieth percentile on some standardized test. (For those wondering if this is circular, the scaled score required to be at the fiftieth percentile is set before the test based on the attributes of the questions included in the test; it is not set after the test based on students’ actual performance.) So one obvious tactic would be to focus on students in roughly the thirtieth to sixtieth percentiles while ignoring the others. Another, more problematic tactic would be to classify as many low-performing students as possible into special education so that they would not be in the denominator. (Then there is blatant cheating, like giving your students more time to take the test or simply correcting their answers afterward; Freakonomics has a chapter on this. It is possible because few if any school districts have the capacity or the motivation to oversee the tests rigorously.) Even leaving aside data manipulation issues, there is also the basic problem that test difficulty varies from year to year. The test in year N + 1 is calibrated to be the same difficulty as the test in year N, but this is all based on statistics, and there is this thing called random variation to deal with.
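    To see how much the denominator tactic can move the headline number, here is a minimal sketch. The ten scores, the cutoff of 50, and the function name are all made up for illustration; nothing here comes from real test data.

```python
def percent_at_or_above(scores, cutoff):
    """Percentage of tested students scoring at or above the cutoff."""
    passing = sum(1 for s in scores if s >= cutoff)
    return 100.0 * passing / len(scores)


# Hypothetical scaled scores for ten students; the cutoff is also hypothetical.
scores = [22, 31, 38, 45, 48, 52, 57, 63, 70, 85]
CUTOFF = 50

# Every student counted: 5 of 10 are at or above the cutoff.
honest = percent_at_or_above(scores, CUTOFF)

# Reclassify the three lowest scorers so they drop out of the denominator.
# Nobody's score changes; only the set of students being counted does.
tested = sorted(scores)[3:]
gamed = percent_at_or_above(tested, CUTOFF)

print(f"all students counted:  {honest:.1f}%")
print(f"three lowest excluded: {gamed:.1f}%")
```

    In this toy case the reported rate rises from 5 in 10 to 5 in 7 without any student learning anything more.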
    And I recently read Natalie Obiko Pearson’s story in Bloomberg on the problems with greenhouse gas emissions data. Most of the numbers we read are self-reported by countries and the companies in those countries, and even if they are honest (a big if) they are “bottom up” estimates — based on how much fossil fuel is being consumed. But when scientists actually measure changes in greenhouse gases in the atmosphere, they get different results than predicted by the bottom-up estimates. And in all the examples cited in Bloomberg, actual atmospheric measurements are higher than bottom-up estimates. This could be because the article didn’t mention atmospheric measurements that were lower than predicted by official data. But it could also be because both the companies burning the fossil fuels and the countries aggregating the data have the same incentive to underreport: companies because it means they don’t have to buy as many carbon permits and countries because it means they can claim to be under their Kyoto Protocol targets.
    Greenhouse gases are a good example of how we think data will help save us — if we can track how much carbon dioxide each company is producing, we can make it pay for that carbon — but we may just not have good enough data. In general, I think the current trend toward using more and more data is a good thing. I mean, what’s the alternative: gut intuition? But this only increases the importance of having good data to begin with. And when some parties benefit from bad data, this can be a big challenge with no easy solution.


