Alaa Moussawi, Chief Data Scientist at the New York City Council, speaks with Jack Berkowitz, Chief Data Officer at Securiti, in a video interview about his role at the council, the importance of making data accessible and the role of an open data portal, the impact of GenAI, and how data-backed evidence drives change.
The role of data in legislation: "We strive to make the legislative process fact-based and evidence-driven, letting the data speak in the least biased way possible. Communication is key — using visualizations such as dashboards and maps enables us to distill complex analyses for policymakers and the community."
The importance of open data: "Around 70-80% of the data we utilize comes from the open data portal, which is fundamental for transparency. By making 'little data' accessible, we can perform statistical analyses to evaluate program efficacy and measure equity in New York City. This transparency empowers both elected officials and constituents to understand what's happening on the ground."
The future of Generative AI in public service: "While there are risks associated with AI, the potential benefits far outweigh them. By leveraging technology like retrieval-augmented generation, our team can enhance efficiency and accessibility in responding to constituents, allowing us to focus on meaningful legislative work."
As the Chief Data Scientist at the New York City Council, Moussawi wears two functional hats. One is running the team of data scientists that he built at the council. Over the past six years, he has also established and led a team of software and data engineers.
Elaborating on the purpose of both teams, Moussawi states that the two are fairly distinct. The team of data scientists works to make the legislative process fact-based and evidence-driven. The goal is to let the data speak in the least biased way possible and to help policymakers, elected officials, and council analysts understand what is happening on the ground in New York City, says Moussawi.
He also believes that communication is key, especially when the technical findings need to be delivered to a non-scientific audience. Therefore, data scientists rely heavily on visualizations such as dashboards, charts, or maps.
Moving on to the software and data engineering space, Moussawi mentions that the team builds internally the entire infrastructure for many of the data processes that run within the legislative division and other parts of the council.
Delving deeper, he discusses building a system that tracks the legislative process from its inception as an idea. For instance, if a council member or elected official has an idea for a piece of legislation, they submit it to the software, which then timestamps it.
From there, the software manages the entire process, from policy analysts and attorneys working to bring the idea to fruition through to its passage as a bill. Moussawi also mentions that the software and data engineering team has built a CRM that enables district offices to track and manage casework when they liaise with their constituents.
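To make the idea-to-bill tracking described above more concrete, here is a minimal sketch of how such a record might be modeled. It is not the council's actual software: the class name `LegislativeRequest`, the `Stage` values, and the field names are all hypothetical, chosen only to illustrate timestamping a submission and logging each later stage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Stage(Enum):
    # Hypothetical stages; the real workflow likely has more steps.
    IDEA = 1
    DRAFTING = 2
    BILL = 3


@dataclass
class LegislativeRequest:
    sponsor: str
    summary: str
    stage: Stage = Stage.IDEA
    history: list = field(default_factory=list)

    def __post_init__(self):
        # Timestamp the idea at the moment it is submitted.
        self._record(self.stage)

    def _record(self, stage):
        self.history.append((stage, datetime.now(timezone.utc)))

    def advance(self, new_stage):
        """Move the request forward and log when the change happened."""
        if new_stage.value <= self.stage.value:
            raise ValueError("a request can only move to a later stage")
        self.stage = new_stage
        self._record(new_stage)


request = LegislativeRequest(sponsor="Council Member (example)", summary="Example idea")
request.advance(Stage.DRAFTING)
request.advance(Stage.BILL)
for stage, when in request.history:
    print(stage.name, when.isoformat())
```

Keeping the full history, rather than only the current stage, is what would let staff see how long an idea spent at each step.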
Emphasizing data accessibility, Moussawi highlights New York City as one of the first cities to create an open data law. The law mandated that all “public” data must be made freely available on a single web portal.
Expanding on it, Moussawi states that the open data portal is similar to a data lakehouse. He adds that it makes the data accessible to people who may not have experience with data processing software.
Moussawi adds that roughly 70-80% of the data used by the data team is sourced from the open data portal, which is constantly expanding. He notes that the council is responsibly refining the law's requirements and creating new ones to ensure that the relevant data needed for administrative purposes is available.
In addition, the open data portal has tools that allow visualization of the information directly, making it a one-stop shop for doing basic data analysis.
Shedding light on the impact of GenAI, Moussawi shares that the council is in the early stages of implementing it. At this stage, it is critical to limit the scope of GenAI so that it is used effectively and does not produce unreliable output.
Retrieval-augmented generation (RAG) models do exactly that, says Moussawi. With RAG models, one can create a database of information and, through prompt engineering, instruct the models to draw only on information from that database.
Moreover, in limited-scope applications, the models can take over work from humans that machines could not previously handle.
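The two steps Moussawi describes, retrieving from a fixed database and then prompting the model to answer only from what was retrieved, can be sketched as follows. This is a toy illustration under stated assumptions: the memo snippets and the names `MEMOS`, `retrieve`, and `build_prompt` are hypothetical, retrieval here uses simple word overlap where a production system would more likely use embedding-based search, and the final call to a language model is deliberately left out.

```python
import re
from collections import Counter

# Hypothetical memo snippets standing in for the database of information.
MEMOS = {
    "memo-001": "Street vendor permits are capped by local law and issued by the health department.",
    "memo-002": "Open data law requires agencies to publish public datasets on a single portal.",
    "memo-003": "Pay disparity analysis compares median salaries across job titles and demographics.",
}


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def score(query, passage):
    """Toy relevance score: number of query words that also occur in the passage."""
    q = Counter(tokenize(query))
    p = Counter(tokenize(passage))
    return sum(min(q[w], p[w]) for w in q)


def retrieve(query, memos, k=2):
    """Return up to k memos with the highest non-zero overlap score."""
    ranked = sorted(memos.items(), key=lambda item: score(query, item[1]), reverse=True)
    return [(memo_id, text) for memo_id, text in ranked[:k] if score(query, text) > 0]


def build_prompt(query, passages):
    """Restrict the model to the retrieved passages via the prompt."""
    context = "\n".join(f"[{memo_id}] {text}" for memo_id, text in passages)
    return (
        "Answer using ONLY the passages below. "
        "If they do not contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\nQuestion: {query}"
    )


question = "Which law requires agencies to publish datasets on a portal?"
prompt = build_prompt(question, retrieve(question, MEMOS, k=1))
print(prompt)  # this text would then be sent to a language model
```

The scope limit comes from both halves together: the model only ever sees passages drawn from the curated collection, and the prompt tells it not to answer beyond them.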
Thereafter, Moussawi shares an example of a council project that utilizes Retrieval-Augmented Generation (RAG) models. He says that there is a massive collection of legal memos written by staff that serves as a repository of information related to various cases.
It is crucial for lawyers to ensure that current findings do not contradict those made in the past. However, such a large body of unstructured files is difficult to work with.
Using RAG models, Moussawi says, the team developed a system that can query all of this data: the machine reads the memos, processes them with an LLM, and responds to questions in real time based on the needs of an analyst or attorney.
This system has significantly reduced the time spent on multiple tasks, preventing cognitive overload and improving focus, says Moussawi. Consequently, the end-user produces solid work in less time, ensuring no critical information is missed.
Furthermore, Moussawi states that while the elected officials are in charge of decision-making, it is his job to provide them with the most accurate data-driven information. For instance, he mentions that an annual pay disparity analysis has been conducted since 2020 and that the data has consistently shown occupational segregation.
In conclusion, Moussawi says that the solid data-backed evidence enabled them to push for legislation aimed at addressing the root causes of occupational segregation.
CDO Magazine appreciates Alaa Moussawi for sharing his insights with our global community.