What next?

We’ve been working on the Breathe2 air pollution (PM) devices for more than a year now. The status so far: there are 5 working devices spread over Pune, (almost) continuously pumping Particulate Matter (PM) data online every 5-10 minutes since the beginning of October 2019. Each device contains a PM sensor, a relative humidity and temperature sensor, a SIM module that uses the 2G network to communicate with the internet, and a microcontroller to manage it all, housed in a plastic enclosure with an inbuilt fan for continuous air circulation.

As an assessment of that work, one thing is sure – we have built ourselves a reliable platform that has been working almost without hitches so far 🙂 All things look OK on the surface, except that we don’t have answers to the following questions:

  1. Do the devices perform as well as reputed but costlier government devices?
  2. What patterns can we observe in how those spots (where the devices have been operating) have behaved over the past months?
  3. Are there any inter-connections between the sensors? Can these devices be taken as indicative of how the city as a whole breathes over time?

Even if we pursue these questions and get some reasonable clarity on them, there seems to be a whole Pandora’s box out there to be faced. Before i put Mrs. Pandora’s questions on the table, here are some pertinent questions that were asked:

What is the vision here? What are we doing? Why did we start? Where do we wish to go? What’s the vested interest in here? It would be best to clarify as much as possible here.

Origin: Frankly, the origin of the idea to get into all this was my desire to get my hands dirty making an instrument. I always dreamed of getting into scientific instruments of some kind, so with some chance encounters with news articles and some free time at hand, we got in. We meaning Abhijeet and myself. We were inspired by the work of the data journalism house IndiaSpend (indiaspend.com) and its director Mr. Govindraj Ethiraj. They had pioneered the low-cost air-pollution networked system in Delhi in 2015. The team expanded and changed over time, and i write all this on behalf of the team – Abhijeet, Sumithra and subir (me) with Mayuresh pitching in at times. This is a collaborative effort between distributed individuals, i being nearest to ground zero and the one holding the pen (or, as it is better known nowadays, the ‘keyboard’).

Vision: In gist, to make a low-cost opensource platform for measuring air pollution in a distributed, networked format.

Vested interest: A very important question, and one that every technologist must answer! But here i am speaking mostly for myself; my team-mates’ views may overlap or differ from these –

  • Agenda #1 – Have fun making things.
  • Agenda #2 – Have some meaningful fun, i mean who doesn’t want to be of some use to the society?
  • Agenda #3 – Get social credits, i.e. get a chance to earn/rise in socially attributed personal value (Should i be ashamed of harboring this most dreadful of weaknesses?).
  • Agenda #4 – Meet like-minded people and work together! How else would i have the chance to engage with so many interesting and valuable friends, beginning with my teammates?
  • Some may argue that since air pollution is the new big thing, what with the new big fad of environmental awareness and sensors coming in cheap/easy = big market and all that, i may have a hidden agenda / pursuit to make some nice money out of a sorry social crisis. I wish i had the brains to do so. To them i beg that they pray to their gods to gift me some business sense, so that i may at the least save some money, if not make it!

But why (seriously technical Q)? Coming back to the basics, i repeat here what Sumithra and i wrote for a research project proposal:

  • Government sensors are the best but too costly for commoners to buy.
    • Being too costly they are sparsely located.
    • The data they output to the public may not be scientifically analyzable by the public and who wants to get into a government department and fight it out to get tax payer’s rightful access? Not me!
    • Air pollution is a highly local phenomenon with multiple factors, such as wind speed and direction, location of pollution sources in the vicinity of or away from the sensor, geography (low/high altitude) and weather conditions (wind/rain/humidity etc.), all acting in unison, so it is hard to determine how representative of the local region the government devices are. The alternative is to average out in time, but then the outcomes become huge averages that lose out on local day/time patterns!
  • Enter low-cost sensors and especially the idea of making an opensource platform using these sensors. Advantages:
    • Can be deployed in 100s if not in 1000s. Each city could have sensors placed on a regular grid to get local trends (High spatial resolution).
    • Due to increased device density, huge 24h averages need not be the limit of the data, as in the case of sparse government sensors. Fine granularity could be achieved (High temporal resolution).
    • Data can be made available to the public in raw and processed form, for sections of the public – local, state, national and international – to scientifically analyze the data for any understanding, without asking for permissions.
    • Since the opensource platform essentially needs to be crowd-funded, people’s participation could get a boost and so could awareness. More, diverse and better device designs and strategies can evolve, overcoming the by-design narrow interests of the government or private players.
  • But here are some issues with them:
    • They are compromised versions of good instruments. Meaning they use measurement techniques that compromise on measurement quality so as to reduce the costs.
    • There is some resemblance to what proper and best instruments would measure, but there is no guarantee of this.
    • Any scientific measurement instrument requires regular calibrations. That is not affordable for these sensors because of the idea behind using such sensors – keeping costs low.
    • Their inner workings are protected by the manufacturers and one can almost treat them as blackboxes. This could change if the sensor design itself is opensourced.
    • These sensors are meant to function reliably at known conditions specified by the manufacturers, but ambient air conditions continuously change, changing the performance of the sensors. There is no control over this significant aspect of this low-cost sensor domain.
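
To make the point about time averaging concrete, here is a minimal sketch of how a single 24h average hides the day/time pattern of one spot. The hourly PM2.5 values are made-up numbers for illustration only (they are not Breathe2 measurements), and the variable names are hypothetical.

```python
from statistics import mean

# Hypothetical hourly PM2.5 readings (µg/m³) for one day at one spot.
# Made-up numbers with a morning and an evening peak, for illustration only.
hourly_pm25 = [
    30, 28, 27, 26, 28, 35,     # 00:00-05:00
    60, 95, 110, 80, 55, 45,    # 06:00-11:00, morning peak
    40, 38, 36, 40, 50, 85,     # 12:00-17:00
    120, 105, 75, 50, 40, 32,   # 18:00-23:00, evening peak
]

# The single number a 24h average would report for this day.
daily_average = mean(hourly_pm25)

# What the hourly data still shows: when the peak was and how high it went.
peak_value = max(hourly_pm25)
peak_hour = hourly_pm25.index(peak_value)
hours_above_average = [h for h, v in enumerate(hourly_pm25) if v > daily_average]

print(f"24h average: {daily_average:.1f} µg/m³")
print(f"Peak: {peak_value} µg/m³ at {peak_hour:02d}:00")
print(f"Hours above the daily average: {hours_above_average}")
```

With these made-up values the daily average sits well below both peaks, which is exactly the local day/time pattern a sparse, heavily averaged dataset would not show.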

So what should an ideal air pollution monitoring device be? In my limited experience, here are some notes:

  • An opensource basic air pollution sensor that clearly and transparently exposes the algorithms it uses, the assumptions it makes and the reasons behind these decisions – so that these fundamentals may be improved upon and so that they can help in the proper interpretation of data.
  • There are many pollutants – PM2.5, PM10, CO, CO2, NOx, SO2, O3 and VOCs. All must be measured since we all now live in semi-industrial settings where all these are prevalent.
  • There must be some way of comparing the device with standard high quality instruments, at least in some statistical way. Say out of a batch of 100 opensource devices made, 1 is compared to a standard lab device and the resulting calibration factors are then implemented in all the 100 devices. And this is repeated at least once every year or something like that.
  • The air that is sampled is adequately conditioned to meet consistently a desired temperature and humidity and volume (normal/standard volume), so that all values are comparable between devices and also in time.
  • The data that is sent to the cloud must be retained at all costs for years to come, and it must be ensured that it is free for anyone to study and quote.
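
The batch calibration idea in the notes above could be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes a simple linear correction (reference ≈ slope × device + intercept) fitted by least squares, which may well be too simple for real low-cost PM sensors. The function names and the co-located readings are hypothetical, made up for the example.

```python
def fit_linear_calibration(low_cost, reference):
    """Least-squares fit of: reference ≈ slope * low_cost + intercept."""
    n = len(low_cost)
    if n != len(reference) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(low_cost) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in low_cost)
    if sxx == 0:
        raise ValueError("low-cost readings must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(low_cost, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_calibration(raw_value, slope, intercept):
    """Correct one raw reading with the batch calibration factors."""
    return slope * raw_value + intercept


# Hypothetical co-located readings (µg/m³): one device from the batch placed
# next to a standard lab instrument. Made-up values, for illustration only.
device_readings = [20.0, 40.0, 60.0, 80.0, 100.0]
reference_readings = [15.0, 27.0, 39.0, 51.0, 63.0]

slope, intercept = fit_linear_calibration(device_readings, reference_readings)

# The same factors are then applied to raw readings from any device in the batch.
corrected = [apply_calibration(v, slope, intercept) for v in [50.0, 90.0]]
print(f"slope={slope:.2f}, intercept={intercept:.2f}, corrected={corrected}")
```

Repeating the fit once a year, as suggested above, would simply mean re-running `fit_linear_calibration` on fresh co-located data and pushing the new factors to the devices.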

The challenges in creating the above ideal device could be:

  1. The above steps will surely increase the costs of the original simple un-calibrated, un-conditioned devices.
  2. Its development and deployment will take much time, effort and skills leading to more chance of this being a privately funded enterprise’s product than the ideal of a publicly crowd-sourced movement.
  3. Who’s going to do all the calibration and maintenance? Citizen engineers?

But why am i against private companies pursuing the above goals? I am not. I actually feel private companies could do the above job more sincerely and regularly because they would have to stand for it in public scrutiny if not legal scrutiny. But i have the opensource bug in me, so can’t help looking in a biased direction. Anyways, who says opensource based businesses can’t exist? See RedHat and Ubuntu/Canonical?

All the above is fine. But there are many more unanswered questions here:

  • Technical questions:
    • What to do with the data?
    • How to convert data into relevant information?
    • Can a pollutant source be located with such a network? Either geographically or even by sub-species. This is called source-apportionment in air-pollution geek-o-logy.
    • Is there a good way to place the sensors, or should they just be placed randomly, as many as possible? These devices don’t come cheap, so Sumithra proposed to study if there’s an optimal way to strategically place the sensors across a city and also to monitor using sensors on criss-crossing city vehicles.
  • Social questions:
    • Relevant to whom? Who would want to get this data anyway? (Thanks Abhijeet for asking and maintaining this question).
    • If we are not doing all this, at the end of the day, to measure the medical impact of pollution, then what’s the point? And how to measure this impact? Is there any way at all, given the huge privacy barrier in the medical industry? (Maybe in collaboration with Aditi Dimri’s/Rasika Lokhande’s health monitoring work?)
    • Can this data be used by advocacy groups to pressure the government to act? Will the government not question the data’s lack of calibration? (An example is the HIRWA group, which successfully pressured the local administration to act against waste burning using such low-cost air monitoring sensor data – news article link)
    • With power comes responsibility, of which humans have a shoddy track record. So here are some potential negative consequences in the hypothetical case that the above air pollution monitoring campaign is successful (Thanks Sumithra for thinking about and opening up this topic to further thought):
      • Suppose large numbers of these sensors are deployed, but only a handful of scientists are working on them. What if a scientist turns rogue, and predicts doom or bliss when the actual pollution state of the city is otherwise?
      • Can powerful organizations (government, corporations, etc.) in any way misuse the data to subdue public interest?
      • Suppose a segregated city (race/religion wise) is mapped; can these sensors be misused as a propaganda tool to stereotype communities?

Along with the above, i am sure many more questions/ideas/doubts exist. I was also lucky to be a part of a general ‘open-hour’ discussion which the kind and generous PublicLabs people hosted (Thanks Stevie and team) on how different interested people all over the world think about this low-cost air pollution monitoring movement. It was held on the 2nd of March 2020. I have not had the time to analyze many of the questions and arguments that came up in the 1h30m long discussion, but it was great overall! Details in the above link.

So coming back to the earth and asking what should be the immediate plan, here are some pointers –

  • Verify the existing 5 Breathe2 devices:
    • Compare with MIT college’s SAFAR dataset and see how they fare.
    • Map patterns over time for all the sensors.
    • Make a small report.
  • Work on Sumithra’s idea :- How to optimally map a city through stationary and mobile low-cost air pollution sensors?
  • Investigate new device platform incorporating more pollutant measurements and input air sample conditioning. (Abhijeet’s help needed here).
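
For the first item in the plan, a comparison against a reference dataset could start with a few simple agreement numbers. The sketch below is an assumption about how one might go about it, not a description of any existing Breathe2 analysis: it assumes both series are already aligned to the same hourly timestamps, and the function name and all values are hypothetical, made up for illustration.

```python
import math


def comparison_metrics(device, reference):
    """Mean bias, RMSE and Pearson correlation of two time-aligned series."""
    n = len(device)
    if n != len(reference) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    diffs = [d - r for d, r in zip(device, reference)]
    bias = sum(diffs) / n                            # average over/under-reading
    rmse = math.sqrt(sum(e * e for e in diffs) / n)  # typical size of the error
    mean_d = sum(device) / n
    mean_r = sum(reference) / n
    cov = sum((d - mean_d) * (r - mean_r) for d, r in zip(device, reference))
    var_d = sum((d - mean_d) ** 2 for d in device)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    pearson_r = cov / math.sqrt(var_d * var_r)       # do the two rise and fall together?
    return {"bias": bias, "rmse": rmse, "pearson_r": pearson_r}


# Hypothetical hourly PM2.5 averages (µg/m³) at matching timestamps.
# Made-up values, for illustration only.
breathe2_hourly = [42.0, 55.0, 61.0, 48.0, 39.0]
reference_hourly = [40.0, 50.0, 58.0, 47.0, 35.0]

metrics = comparison_metrics(breathe2_hourly, reference_hourly)
print(metrics)
```

A high correlation with a steady bias would suggest the device tracks the pattern but needs a correction factor, whereas a low correlation would point to a deeper mismatch between the two measurements.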

The broader Qs need to be dealt with as and when the brains and pockets grow.