Looking for info on carbon emissions by activity

Hello internet – I’m doing one of those requests for help again. I pasted this to Facebook friends recently, but it seems worth putting here on my blog as well:

Okay facebook friends, I need your help.

Y’know how companies and orgs have CSR reports where they list their sources of emissions, and roughly what proportions of these emissions their activities represent?

I’m looking for stats like this on:

a) public sector bodies that employ lots of people but are primarily office-based
b) service-based companies or high-tech companies that sell digital products

I’m asking because even now, I couldn’t tell you where the biggest sources of emissions are in the majority of service-based organisations and have the data to back up my reckons, and I don’t think many of you could either.

If we don’t have this, how can we know we’re being effective?

If you can share this with me, I’ll start making viz and graphics, so at a glance we can have a more informed convo about this. I started doing this when I read Drawdown recently to help me understand it better, and I want to do the same for something closer to home.

Yes, I know every company is unique.

But there will be patterns. We commute. We spend energy keeping people mostly warm and dry so they can be effective. We often have high paid people travelling quickly around the world.

Surely I know people who have access to this information?

Please share if you can – I’ve looked around and I’ve failed so far to find anything usable at the organisation level.

More notes

I’m aware of scoped emissions according to the GHG Corporate Standard, and I’m aware you can infer some kinds of activity from the distribution of an organisation’s emissions among scopes 1, 2, and 3.

This was one of the key ideas behind the stuff we were building at AMEE (Avoid Mass Extinction Engine), when I was a developer and product manager there.

The thing I’m looking for in particular is the kind of activity – what’s the mean percentage for air travel, or office use and so on?

Is this data published in any aggregate form? That there is the thing I’m trying to understand.
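To make the question concrete, here is a minimal sketch of the arithmetic I have in mind, assuming each organisation's report can be boiled down to emissions per activity. The organisations, activity names and tonnage figures below are all made up for illustration; they are not real data, and the sketch assumes every organisation reports a non-zero total.

```python
from statistics import mean

# Hypothetical, made-up figures in tonnes CO2e, purely for illustration.
reports = {
    "Org A": {"air travel": 40.0, "office energy": 50.0, "commuting": 10.0},
    "Org B": {"air travel": 10.0, "office energy": 60.0, "commuting": 30.0},
}


def activity_shares(emissions):
    """Turn absolute emissions per activity into percentages of the total."""
    total = sum(emissions.values())  # assumed to be non-zero
    return {activity: 100.0 * value / total for activity, value in emissions.items()}


def mean_shares(reports):
    """Average each activity's percentage share across all organisations."""
    per_org = [activity_shares(emissions) for emissions in reports.values()]
    activities = sorted({activity for shares in per_org for activity in shares})
    # An activity an organisation does not report counts as 0% for that organisation.
    return {a: mean(shares.get(a, 0.0) for shares in per_org) for a in activities}


if __name__ == "__main__":
    for activity, share in mean_shares(reports).items():
        print(f"{activity}: {share:.1f}%")
```

Averaging percentages rather than tonnes means a huge organisation does not drown out a small one, which seems closer to the "typical organisation" picture I'm after, though other weightings would be just as defensible.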


How much of the web runs on renewables today?

As part of the work I’m doing on the Planet Friendly Web, I’m trying to get access to data that I can base the guide on. In some cases this involves creating datasets from existing data. Here I share some findings from a dataset I generated along the way.

For example, to get a figure on how much of the web runs on renewable power, I started with a dataset of the top 1 million domains by traffic from Alexa.com, then ran the list against the Green Web Foundation’s own API. The foundation maintains a list of which domains run on renewable power.

Doing this involves making something like 100k API requests, so I created a screenscraper to carry out the job and take care of retries, failed requests and so on. You can see it here on GitHub.

I’ve uploaded the dataset created to datbase, partly as an experiment in making it available in a decentralised way, but also partly to try out the workflow for publishing data.
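For anyone curious what "take care of retries, failed requests and so on" might look like, here is a rough sketch. It is not the actual scraper: the function names, the retry count and the backoff delays are all assumptions of mine, and the real API call is left as a `check` function you pass in, so nothing here depends on how the Green Web Foundation API actually behaves.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def check_with_retries(
    domain: str,
    check: Callable[[str], bool],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bool]:
    """Return check(domain), retrying on any exception.

    Returns None if every attempt fails, so one bad domain
    does not stop the whole run.
    """
    for attempt in range(max_attempts):
        try:
            return check(domain)
        except Exception:
            # Back off exponentially before the next attempt,
            # but do not wait after the final one.
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return None


def check_domains(
    domains: Iterable[str],
    check: Callable[[str], bool],
    **kwargs,
) -> Dict[str, Optional[bool]]:
    """Check every domain, recording True, False, or None for a failed lookup."""
    return {domain: check_with_retries(domain, check, **kwargs) for domain in domains}
```

In use, `check` would be whatever function makes the HTTP request for a single domain and returns whether it is listed as green. Recording `None` for failures keeps them separate from a genuine "not green" answer, so they can be retried later.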

So, now we have some data, let’s see what we can do with it, right?

Doing some analysis and some interesting findings

I have an earlier exploration of the data in a notebook on GitHub, but when working with this data, I’m a bit embarrassed to say I forgot how to use the DataFrame filters to slice the data quickly.

So instead, I’ve used OpenRefine. You could probably store this in a Google spreadsheet too, as 100k rows is big, but not THAT big.

Anyway, what do we see?

There are a few interesting findings just from faceting the data in OpenRefine, as shown below, and sorting by count along a few dimensions:

[Screenshot: the dataset faceted in OpenRefine]

If you’re not familiar with OpenRefine, I’ll summarise what’s visible in this view:

  • Youtube.com is now more popular than google.com. Who knew?
  • The top three websites in the world run on renewable power. Huzzah!
  • Based on the Green Web Foundation’s data, around 7% of the most popular domains on the web run on renewable power.
  • Hetzner AG, a German hosting company, hosts more domains running on green power than Google does.
  • Amazon doesn’t appear here at all as a green provider.

After a slow start, I understood Amazon to be a HUGE player here, and while they have a nice shiny page showing off their windfarms and how much renewable power they use, they also run a load of their servers on coal. That they don’t appear may be an artefact of the Green Web Foundation going by an organisation’s entire power mix to decide whether a company is running on green power or not.

I think I need to check with Rene at the Green Web Foundation to see.
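If you would rather not open OpenRefine, the same kind of faceting can be approximated in a few lines of plain Python. This is only a sketch: the column names (`domain`, `green`, `hosted_by`) and the toy rows are my own assumptions for illustration, not the real dataset or its schema.

```python
from collections import Counter

# Toy rows standing in for the real dataset; names and values are invented.
rows = [
    {"domain": "one.example", "green": True, "hosted_by": "Host A"},
    {"domain": "two.example", "green": True, "hosted_by": "Host A"},
    {"domain": "three.example", "green": False, "hosted_by": "Host B"},
    {"domain": "four.example", "green": True, "hosted_by": "Host C"},
    {"domain": "five.example", "green": False, "hosted_by": "Host B"},
]


def facet(rows, column):
    """Count rows per distinct value of a column, most common first."""
    return Counter(row[column] for row in rows).most_common()


def green_share(rows):
    """Fraction of rows flagged as running on green power."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row["green"]) / len(rows)


def green_hosts(rows):
    """Facet by hosting provider, keeping only the green rows."""
    return facet([row for row in rows if row["green"]], "hosted_by")


if __name__ == "__main__":
    print(f"Green share: {green_share(rows):.0%}")
    for host, count in green_hosts(rows):
        print(host, count)
```

Filtering to the green rows first and then faceting by provider is how you would get a ranking like "which host has the most green domains" out of the data.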

Fancy playing too? Come hang out on slack

This shows some pretty superficial analysis, but there are already some interesting nuggets here.

If working with this data sounds interesting to you, let me know in the comments – I’m looking for collaborators on the Planet Friendly Web Guide.

Alternatively, come hang out in the sustainableux.com slack channel, where there’s a nice little community growing around sustainable web design.

If you prefer email

It turns out there’s a W3C Sustainable web design group. Here’s my post to the mailing list, if you’d prefer to communicate there via email.