In one sense it’s news to celebrate: Since the passage of New York City’s landmark open data law in March 2012, city agencies have unleashed more than 1,100 files revealing the workings and impact of local government, via New York City’s data portal.
Yet hundreds more datasets that by law should have been released by March 2013 are still missing — a sign of just how challenging it will be to meet the law’s goal of opening nearly all city data to public view and reuse.
The tension between appreciation and frustration ran throughout a packed hearing on Wednesday of the City Council’s Committee on Technology, checking in on the progress of the city’s Department of Information Technology and Telecommunications (DoITT).
New York City Chief Analytics Officer Michael Flowers, a hero in the civic hacker community, has played a pivotal role in liberating valuable datasets from disparate city agencies and adding them to the city’s open data portal.
Flowers cited wins such as the public release of the massive geographic data trove PLUTO, previously hidden by the Department of City Planning behind a restrictive paywall and a licensing agreement that prevented sharing any part of the data or putting it online. But he also acknowledged that extracting raw data from myriad platforms, formats and agencies poses big challenges.
All data that agencies already make available through their own websites, whatever the format, were to be available in machine-readable format through the open data portal by March. To date, only about 70 percent of those datasets are up, Flowers reported.
“I don’t know if we will ever get to 100,” Flowers said when asked when he anticipated the data-uploading project would be 100 percent complete. “We will always be updating…. I just don’t feel comfortable putting a date on it.”
To help users track planned release dates and available datasets, DoITT released an open data dashboard for the NYC Open Data portal on Wednesday. The dashboard, a work in progress, currently tracks how far each city agency and business improvement district has come in publicly releasing its first round of data.
Testimony from civic hackers and activists overwhelmingly supported the law but also argued that there is a mismatch between what the public wants to see and what the open data portal currently makes accessible.
John Kaehny, executive director of the government accountability group Reinvent Albany, advocated for moving data of the greatest public interest to the top of DoITT’s priority list. He contended that the Department of Education and NYPD have been “dragging their feet” when it comes to handing over readily available data.
“In some cases it is technical difficulties,” said Kaehny, “but in some cases it is that the agencies want control.”
Kaehny also urged a “one strike you’re in” policy, in which data requested under New York’s Freedom of Information Law would be automatically added to the open data portal.
The open data law is intended to make any dataset that is subject to New York’s Freedom of Information Law available for public consumption through one central repository. Agencies must upload all such data to the portal by the end of 2018.
Advocates and hackers testified that the pace at which data is being translated from inaccessible PDFs and customized reports into machine-readable formats, such as comma-separated value files, has been frustrating.
“Make the data something we can use,” advised Juan Martinez, general counsel of the safe-streets advocacy group Transportation Alternatives.
City Council Technology Committee chairman Fernando Cabrera seemed to sympathize with the frustration. “You want this data to be available in its purest form,” he said to Martinez.
Kaehny exhorted the council itself to step up and help bulk up the data portal to full strength. “The City Council needs to use this law,” he urged, suggesting that members and their staff were in a position to apply pressure to city agencies. “Part of this law is empowering the council.”
Yet the council itself appears to be part of the problem. The dashboard shows the New York City Council intends to post four datasets on the portal: council members, committee hearings, legislative items, and land use items. Even though three of those datasets are available on the council’s website now, they will not be available through the portal until 2015, and the fourth, detailing the council’s crucial role in approving zoning and other land use changes, will not be posted until the bitter end, in December 2018.