Plotting a histogram on a log scale with Matplotlib

I have a pandas DataFrame with the following values in one of its Series:

x = [2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27, 22, 1, 12, 7, 19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1]

I was instructed to plot two histograms in a Jupyter notebook with Python 3.6.

x.plot.hist(bins=8)
plt.show()

I chose 8 bins because that looked best to me. I have also been instructed to plot another histogram with the log of x.

x.plot.hist(bins=8)
plt.xscale('log')
plt.show()

This histogram looks TERRIBLE. Am I not doing something right? I've tried fiddling around with the plot, but everything I've tried just seems to make the histogram look even worse. Example:

x.plot(kind='hist', logx=True)

I was not given any instructions other than to plot the log of x as a histogram.

For the record, I have imported pandas, numpy, and matplotlib and specified that the plot should be inline.

2

3 Answers

Specifying bins=8 in the hist call means that the range between the minimum and maximum value is divided equally into 8 bins. What is equal on a linear scale is distorted on a log scale.

What you could do is specify the bins of the histogram such that they are unequal in width in a way that would make them look equal on a logarithmic scale.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
x = [2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27, 22, 1, 12, 7, 19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1]
x = pd.Series(x)
# histogram on linear scale
plt.subplot(211)
hist, bins, _ = plt.hist(x, bins=8)
# histogram on log scale.
# Use non-equal bin sizes, such that they look equal on log scale.
logbins = np.logspace(np.log10(bins[0]),np.log10(bins[-1]),len(bins))
plt.subplot(212)
plt.hist(x, bins=logbins)
plt.xscale('log')
plt.show()
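
To see numerically why the equal-width bins look distorted, it can help to compare how wide each bin is once it is drawn on a log axis. The following is a minimal sketch using only NumPy and the data from the question; the variable names are my own and nothing here is required for the plot above.

```python
import numpy as np

x = np.array([2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27,
              22, 1, 12, 7, 19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1])

# Equal-width edges on the linear scale (what bins=8 produces).
_, linear_edges = np.histogram(x, bins=8)

# Log-spaced edges spanning the same range, built the same way as logbins above.
log_edges = np.logspace(np.log10(linear_edges[0]),
                        np.log10(linear_edges[-1]),
                        len(linear_edges))

# Width of each bin as it would be drawn on a log10 axis.
linear_widths_on_log_axis = np.diff(np.log10(linear_edges))
log_widths_on_log_axis = np.diff(np.log10(log_edges))

print(np.round(linear_widths_on_log_axis, 3))
print(np.round(log_widths_on_log_axis, 3))
```

The first array shrinks from left to right (the first linear bin alone covers more than a decade), while the second is constant, which is why the log-spaced bins appear equally wide after plt.xscale('log').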


1

Here is one more solution without using a subplot or plotting two things in the same image.

import numpy as np
import matplotlib.pyplot as plt
def plot_loghist(x, bins):
    hist, bins = np.histogram(x, bins=bins)
    logbins = np.logspace(np.log10(bins[0]), np.log10(bins[-1]), len(bins))
    plt.hist(x, bins=logbins)
    plt.xscale('log')
plot_loghist(np.random.rand(200), 10)
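
The np.histogram call in plot_loghist is only used to find the smallest and largest bin edge. A possible shortcut, sketched here under the assumption that every value is strictly positive, is to build the edges directly with np.geomspace. Note that log_bin_edges is a hypothetical helper name of my own, not part of NumPy or Matplotlib.

```python
import numpy as np

def log_bin_edges(x, n_bins):
    """Return n_bins + 1 edges that are evenly spaced on a log axis (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # Assumption: all values are strictly positive, since the log of
    # zero or of a negative number is undefined.
    return np.geomspace(x.min(), x.max(), n_bins + 1)

x = [2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27,
     22, 1, 12, 7, 19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1]

edges = log_bin_edges(x, 8)

# For comparison: the np.logspace construction over the same range.
reference = np.logspace(np.log10(min(x)), np.log10(max(x)), 9)
print(np.allclose(edges, reference))
```

The resulting edges can be passed as plt.hist(x, bins=edges), followed by plt.xscale('log').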


3

"plot another histogram with the log of x" is not the same as plotting x on a logarithmic scale. Plotting the logarithm of x would be:

np.log(x).plot.hist(bins=8)
plt.show()


The difference is that the values of x themselves were transformed: we are looking at their logarithm.

This is different from plotting on the logarithmic scale, where we keep x the same but change the way the horizontal axis is marked up (which squeezes the bars to the right, and stretches those to the left).
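
The two readings are still closely related, which a small numerical sketch can show. Below, the half-decade bins from 10**0 to 10**3 are an illustrative assumption of mine (not the 8 bins from the question), and base 10 is used instead of the natural log; the choice of base only rescales the horizontal axis.

```python
import numpy as np

x = np.array([2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27,
              22, 1, 12, 7, 19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1])

# Illustrative assumption: half-decade bins from 10**0 to 10**3.
log_edges = np.linspace(0, 3, 7)   # bin edges in log10 units
edges = 10.0 ** log_edges          # the same edges in the original units

# Reading 1: histogram of the transformed values log10(x), equal-width bins.
counts_log_values, _ = np.histogram(np.log10(x), bins=log_edges)

# Reading 2: histogram of the untransformed x, with the matching log-spaced bins.
counts_log_axis, _ = np.histogram(x, bins=edges)

print(counts_log_values)
print(counts_log_axis)
```

With matching bins, both approaches count the same values in each bar. What differs is the axis: the first is labeled in units of log10(x), the second in the original units of x on a log-scaled axis.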
