The Life of A Hashtag

This topic may not seem like one that would traditionally be graphed, but I feel “The Life of A Hashtag” is a perfect example of the power of data visualization.

 

Taken from Visual.ly, “The Life of A Hashtag” is an interactive infographic that takes a hashtag you enter and, in real time, reports several statistics about it. According to Google Dictionary, an infographic is “a visual image such as a chart or diagram used to represent information or data”. For the past month, this infographic tells the viewer the number of tweets that include the selected hashtag, the most influential tweet, and the current most influential tweet to date, among other things. For the sake of this blog post, I entered the hashtag “#GoBlue” and was very interested in the results. So if you’re wondering what the life of “#GoBlue” looks like, it is something like this:

Image

Interpreting the data, I found this infographic very effective. “The Life of A Hashtag” told me things about the hashtag “#GoBlue” that I would have never known, such as that the most popular person to tweet it this month was @ParisLemon, who surprisingly has more followers than The University of Michigan Men’s Basketball Team. The infographic also has a line graph that shows the spikes in “#GoBlue” activity over the past month, and there is a large jump around February 23. This shows a limitation of the infographic: it offers no explanation for the jump, so the reader has to rely on common knowledge. By common knowledge, I simply mean that if the reader lives in Ann Arbor, or follows Michigan Men’s Basketball, they would know that the men’s team played, and beat, Michigan State University by 9 points with a final score of 79-70.

My final thought is that this infographic is a great tool for representing data. I really enjoyed the fun colors, text, and graphs, and in this example I especially enjoyed the interactivity. Still, all data visualization has its limits, and nothing is 100% clear without text to help explain it.

Data Visualization: Fun Design without Distortion

Although charts and tables have a reputation for providing the most accurate representations of data, I think data visualizations have a lot to offer. Tufte warns us of the many all-too-easy ways we can go wrong and commit graphical fraud, which seems to be an especially easy trap to fall into when using data visualizations. There is definitely a standard that needs to be met in order for me to trust a graphic, but that holds true for any table or chart as well. Take this graphic for instance:

 Image

I pulled this image from a website called visual.ly, which invites anyone to use their tools to create their own data visualizations. The website intrigued me because I thought it might show how people who may not be as well-versed in data representations as Tufte or Booth go about creating graphics, and to what extent they apply Tufte’s ideas of “graphical integrity.”

At first glance, this image struck me as amusing. After a little consideration, I thought it was creative more than anything else. I commend the author of this graphic for making his/her numerical and descriptive data about alcohol into a “science” by formatting it as the periodic table, presenting me the facts of the data in an interesting and fun way while still, for the most part, encouraging me to take it seriously. 

What strikes me most, however, is the ease with which the exact same information could be put into a table (what we might think of as a “normal” table, at least, since it’s already in a periodic table). The data, including the percent of alcohol in each drink, the flavor, and the year of creation, is not skewed, as far as I can tell, by the manner in which it is presented, and skewing is exactly what Tufte warns against. I considered that “The Periodic Table of Alcohol” might be implying some sort of hierarchy within the drinks that isn’t meant to be there, but what I found was that there WAS a hierarchy that was implied and it WAS meant to be there. Just as the Periodic Table of Elements is arranged within different groups according to atomic number (don’t blame me if I’m wrong; science isn’t my forte), so too is this parody of the Periodic Table arranged within groups (Vodka, Beer, Wine…) according to alcohol content. So, while a little basic knowledge of the Periodic Table of Elements is helpful in understanding the format of this table, it isn’t mandatory.

All in all, I think this particular data visualization is committing none of the sins written about by Tufte. If anything, it is using graphic design to organize information in an intriguing way that follows a pattern of arrangement that wouldn’t have the same effect in any regular table. It is all of the fun design, accurate data, and clarity without any of the distortion. 

Effective Data Visualization

Much like charts and tables, data visualization can be both effective and ineffective. Its goal in the most basic sense is to present data, much like a table or a chart, but it attempts to do this in a more visually stimulating way than your standard bar graph or stripped-down table. If the data is not particularly dense, it can be a much better way to show an idea. One of the example graphs we looked at showed the price of a barrel of oil and how it has gone up over the years. The price was represented by actual barrels of oil, and as the price went up, the barrel grew in size. This isn’t the most effective way to show that the price per barrel is rising, but it gives a general idea of it in an interesting way. If the article is less focused on the raw data and facts, and rather focuses on analysis or on how culture or society has adjusted to rising oil prices, this type of data presentation is very good.

Image

 

I pulled this map from a website called creativebloq.com, where it appeared in the article The 33 best tools for data visualization. This is a good example of how data visualization can be very effective. A table could show which state each president came from, or which states produced the most presidents, and a chart could do this as well. We’ve seen this done a hundred times and it’s just no longer exciting or interesting. This picture shows a portrait of each president placed on the state he was born in. This is much more attention grabbing, obviously. But it also has its worth beyond aesthetics: it shows how concentrated the area that most presidents come from is. Generally, they are all from the East Coast, with a few from Texas. One was born in California, and obviously Obama was born in Hawaii. It shows how the East Coast has monopolized the presidency and how the western US is barely represented in the White House.

That said, this type of visualization has its limits. If you asked where Franklin Pierce was born, you couldn’t find the info from this picture. You could say he was born on the East Coast, probably somewhere in New England, but something like this definitely fails at showing and highlighting the specifics that tables and graphs are far stronger at illustrating. For some essays, data visualization can turn data into a visually stimulating representation. For an academic paper or something that isn’t for entertainment purposes, a table or chart will always be preferred.

design x food

Google Images became overwhelming while I searched for data visualizations. There were so many different types of visualizations to choose from; some boring, some eye-catching and others very confusing. Perhaps I liked the image below because of its striking, complex, and also simple qualities. Or maybe I was just hungry. Nonetheless, I chose this image because I wasn’t quite sure if it would meet the common standards of Tufte, Miller, and others we’ve studied thus far.

design x food

It took some investigating to figure out what exactly this graphic is trying to show. I couldn’t tell whether it was representing details of an average American’s breakfast or whether it was about cereal specifically. Although I found it out of context, I felt it was problematic that it took so much searching to determine what the image was trying to convey. I certainly couldn’t tell from the image alone.

What I found was a book created by designer Ryan MacEachern that represents his diet through graphics rather than just numbers. He presents pages of graphics along with a table detailing his food intake for the day, explaining that the book explores “the nutritional values of the diet and presents it in a contrasting way, it juxtaposes the dull and boring appearance of the food I was eating by presenting data using colourful vibrant foods, which were almost entirely excluded from my diet” (MacEachern).

While overall these images, and the book in its entirety, are very visually appealing, many of the visuals seem to overshadow MacEachern’s overall goal, something which Tufte warns against. Sure, showing the nutritional values with cereal is more interesting than with tuna, and is a direct representation of the juxtaposition he was aiming at, but it doesn’t really have anything to do with the data. He was actually eating tuna in some cases, not the sugary cereal. The focus here doesn’t seem to be on the numbers, but rather on the visual representation and juxtaposition of the different diets and the differences in their vibrancy. The statistical qualities, which I assume were supposed to be the main point, are pushed to the background (literally, the tables are very difficult to read even in the book) and the images distort the numbers. In the case of the image above, the use of different cereals is definitely interesting and colorful, but I was more interested in the representation than in the actual dietary information. I noticed the food more than the numbers, which wasn’t really the point.

That being said, his representations do have some strengths. By transforming his tables of nutritional values into a graphical representation (even though it may be a bit distracting), the numbers become more visually appealing to the reader. Also, the basic representation is simple. He was trying to show what percentage of a day’s worth of calories came from fat, protein, and carbs, and, if the distracting visual was taken out, the viewer could clearly see the breakdown of the numbers, seeing how, for example, the 38% from protein made up 482 kcal. The graph is focused, in that he takes the key qualities of the table that he wants to show to create a bigger picture. Rather than focusing on the nutritional values as separate entities, he combines them to show how they make up one larger characteristic of his diet (calories).
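The arithmetic behind that breakdown can be checked in a few lines. This is only a sketch: it assumes the 482 figure is in kilocalories and that 38% is protein’s share of the day’s total, and the variable names are mine rather than anything from MacEachern’s book.

```python
protein_kcal = 482      # calories from protein, as read off the graphic
protein_share = 0.38    # protein's share of the day's calories

# If 482 kcal is 38% of the day, the whole day is 482 / 0.38.
total_kcal = protein_kcal / protein_share
# Whatever is left over must be split between fat and carbs.
fat_and_carbs_kcal = total_kcal - protein_kcal

print(f"Estimated total: {total_kcal:.0f} kcal")
print(f"Fat and carbs together: {fat_and_carbs_kcal:.0f} kcal")
```

A plain calculation like this is the kind of information the graphic should make easy to read off, which is why the distracting visuals matter.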

The problem here is that there’s an uneven balance of design and statistical knowledge, with the design aspect clearly winning. Yes, the graphical representations are more striking than a simpler pie chart could be, but they are also a distraction. Frankly, he would have been better off with the tables included in his book and a simpler pie chart, as those are more focused. The goal of this compilation is very interesting, but I don’t think, unfortunately, that the data visualization meets it in an effective way.

Data Visualization: Wind

At first I was a bit confused about what Data Visualization was (i.e., I had to email Prof. Modey and ask). After a quick search on Google Images though, I think it was just that I didn’t know that this type of representation was called ‘Data Visualization’. I’ve definitely seen these images before; really, I see them all of the time. It’s cool to put a name to this type of image.

Anyway… in my short research time I stumbled upon this image:

Image Courtesy of theregister.co.uk

 

First, I was struck by how beautiful the image was. There is something classic about a black and white image, and I was immediately drawn to the subtle elegance of the sweeping lines across it (even before I realized what the lines were supposed to represent). I really found it to be an aesthetically appealing image. Looking closer, I was able to deduce that the lines were wind patterns. They look a lot like the images of wind seen on the news, and because of that association I think I was able to make the right guess. And looking outside my window into the blustery snowpocalypse of Ann Arbor, the weather was certainly on my mind. That, and the beauty of the image itself, were what first caught my attention.

Technically speaking, this Data Visualization has a lot of interesting strengths. In terms of Tufte’s graphical integrity, it seems to do a decent job. The number of dimensions depicted does not exceed the number of dimensions in the data: two dimensions for two variables, wind speed and direction (Tufte 77). Similarly, the labels for this image seem good (Tufte 77). Although sparse, I feel like they are sufficient considering the visual aid of the wind representations. It looks like the Data Visualization might have also initially been interactive (allowing you to focus in on specific locations and get wind speed and direction). However, this was no longer an option with the copy of the image I found on Google. This interactive aspect could explain the lack of labels on the image I found (more labels would appear as you scrolled). The labels might also be lacking because the image has an independent movement that makes sense for its claims (Miller). The wind on the page is frozen in a movement aligned with the real world.

And in light of our most recent readings, it is important that I personally found the data here interesting. It is relevant to my current environment, which helps bridge the gap between Topic, Question, and Significance (Ch. 4 Booth).

These strengths stated, there are a few significant flaws with this Data Visualization. Again, looking at Tufte, there is an emphasis here on design variation, not data variation (Tufte 77). Although the two are closely connected in Data Visualization, the data variation can be a bit ambiguous, and resultantly a bit subjective (which I find a real problem in terms of evaluating data). There could be a bit of reader bias here, which is something that Miller would tell us to avoid. Similarly, this bias could lead to distortions and misrepresentations, again things Miller warns us against when creating visualizations from data.

It seems that this Data Visualization captures the imaginative beauty that a good image should, but falls a bit short in terms of tangible, usable data. It creates a thought-provoking image, but the image is somewhat unclear and could lead to severe reader bias and misinterpretation. It did catch my eye though, and in the onslaught that is Google Images, I think that is still worth noting.

Surprise! It’s Table 8-2!

Piggybacking off the ideas of my classmates, I would also agree that Table 8-2 was very easy to interpret and understand.

Table 8-2

William J. Gilmore’s accompanying text (276) helps to support the chart presented. The table presents the Composite Size of Family Libraries in the years 1787-1830, and does so clearly and concisely. The chart shows that as the libraries increase in size, their number decreases, and vice versa. It also tells the viewer that the majority of the libraries in that time period held between 1 and 9 volumes, as those groups make up most of the libraries. I would also like to note that this particular chart is limited by the range that was selected, 1787-1830. Why this time period? Why not start with 1780 or 1785? I feel that by presenting the data in this specific and detailed range, Gilmore is misrepresenting the data in a way.

Looking at this chart, I’m not even sure if there needs to be a graph. I feel that all of the information presented in the chart effectively tells me about what libraries were in the time between 1787-1830. But, if I were to present this graph in a way that would represent what I had interpreted from combining the textual evidence with the chart, I would select a line graph. I would select a line graph because the line graph would show most efficiently the inverse relationship between volume size and libraries. I believe that by presenting the data this way, there is an easy visual representation of what I took from the chart and Gilmore’s text.

From Table to Graph

In William J. Gilmore’s Reading Becomes a Necessity of Life, tables and graphs are used to make sense of quantified data in a way that is intended to supplement Gilmore’s claims concerning the reading habits of rural American families from 1787 to 1830. One table in particular, Table 8-2, stood out to me because of a few characteristics: it is simple, easy to understand, and self-contained. 

Image

As Jane Miller wrote, effective tables are self-contained and well-labeled “so your audience can understand the information without reference to the text” [1], and I believe this is exactly what Gilmore accomplishes. The table’s header is compact without being too brief and includes a geographic location and time period. The columns are titled with a description and units (when applicable) and are not overly-crowded by lines or extraneous data. He includes a line for a total count at the bottom, which is a helpful feature so that readers can get a feeling for how many libraries are being accounted for. In his explanation, Gilmore groups different parts of the table together in order to make his point. For example, he writes “It is very surprising…that more than a tenth (11 percent) of all libraries contained more than twenty-five volumes…” It was easy for me, as a reader, to look at the table and notice that most of the libraries had only 1-5 books; however, Gilmore’s analysis drove my attention to what is apparently a more surprising observation. In this way, the table is easy to understand, but it is Gilmore’s written analysis that provides the bulk of information about what readers should be paying attention to. 

If Gilmore were to present this data in a graph rather than in a table, I find that there are two possible ways he could go about doing it. The choice of what type of graph to use, of course, depends upon the purpose the information is meant to serve. Since the data is not representative of a trend over time, a line graph is out of the question. A bar or pie chart would be a better method. For example, a bar chart might be used to show how many libraries have a certain number of books (e.g., 90 libraries have only 1 book), with the y-axis measuring the number of libraries and the x-axis divided into increments of 1 volume, 2-3 volumes, 4-5 volumes, and so forth. This, however, would eliminate the “Percent of All Libraries” information that the table provides and that Gilmore uses in his analysis. It is possible to label each of the “bars” of the graph with the percentage it represents, as long as it doesn’t overcrowd the graph and confuse the data. This, however, might not be as helpful as a pie chart, which could create a visual of, for example, the 90 libraries that have 1 book compared to what all of the other 306 libraries have.
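To make that comparison concrete, here is a minimal Python sketch that turns the two counts mentioned above (90 libraries with a single volume, and the other 306) into percentage shares and a rough text bar chart with both the count and the percent on each bar. The function names, the category labels, and the bar width are my own choices for illustration; nothing here comes from Gilmore’s table beyond those two numbers.

```python
def percent_shares(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


def text_bar_chart(counts, width=40):
    """Render one line per category: label, bar, count, and percent."""
    shares = percent_shares(counts)
    longest = max(counts.values())
    lines = []
    for label, n in counts.items():
        # Scale every bar against the largest category.
        bar = "#" * round(width * n / longest)
        lines.append(f"{label:<16} {bar} {n} ({shares[label]:.1f}%)")
    return "\n".join(lines)


# The only two counts given in the post: 90 one-volume libraries, 306 others.
libraries = {"1 volume": 90, "all other sizes": 306}
print(text_bar_chart(libraries))
```

Labeling each bar with both values is one way to keep the “Percent of All Libraries” information without needing a second graph.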

In this particular case, I am convinced of the effectiveness of Gilmore’s original table to the point where I probably would not change to a graphical representation at all. His table is clean and easy to understand; if after experimentation with a bar or pie graph this same simple display of data is not achieved, I would be inclined to keep the table instead. 

 

[1] Miller 1840 (ebook edition) 

Making Sense of Data

A table is the simplest and most concise way for data to be compiled into an image. If it does its job correctly, it will quickly present variables and their values and illustrate what point is being made in the surrounding paragraphs far more effectively than words could. In the Gilmore reading, tables are used effectively to illustrate points throughout the text. One of the more complicated tables is shown below.

Image

At first glance, this table seems to be presenting too much information. It is hard to make sense of all the values and what they mean. Why does he continuously meander between percentages and sums? After a more in-depth look, though, the table becomes much easier to make sense of, broad as it is. He is dividing the family libraries by content and showing how many libraries and how many volumes fall into each type. Because he gives us the number of libraries in Windsor and the total number of volumes, including percentages isn’t entirely necessary, but it is helpful. What I found most challenging about the table was actually trying to make sense of the variables. He goes into little to no detail about what a “sacred intensive library” might include. He uses intensive and extensive to divide certain libraries, but doesn’t explain how he determined what books might be on the shelves of a secular intensive versus a secular extensive library. While he presented the data effectively, it was hard to make sense of a lot of it simply because he didn’t go to great lengths to explain how these two libraries would differ. He creates a category for Sacred Intensive-Extensive, but there is already one for Sacred Intensive. It doesn’t really hurt the table, because I can still make sense of what is being presented, but it was hard to understand each type of library makeup, and eventually I became more fixated on trying to figure out what each library was rather than how big it was and how this compared to the rest of the libraries.

To present this info on a graph wouldn’t be terribly difficult. You could use a bar chart to show the levels for each library type: the type of library would be on the x axis and the number of libraries would be on the left y axis. For each type of library, a bar would rise to the value for number of libraries. On the same chart, you could place a line corresponding to the right y axis that would show percent of all libraries. The same type of graph could be done with volumes. Needing this much explanation further proves that data is best shown visually rather than described. In this case, though, a table is preferred. There isn’t some overwhelming trend that needs to be shown here, so a graph would do its job, but not any better than a table would. In any case you would probably need two or three graphs to show all the data in the table, which makes comparing them more difficult. The table, though not pretty, most effectively shows exactly what Gilmore is trying to explain.

 

Table 8-2: An Evaluation

The table below is from page 276 of William J. Gilmore’s chapter, “Deep Structure and Rural New England Mentalités: Reading and the Family Circle, 1780-1835”.

Image

This table was the easiest for me to understand out of all the ones included in Gilmore’s chapter, as it is focused on only a few key points that give a picture of how large family libraries of the Windsor District were from 1787 to 1830. Unlike the other tables, it is fairly self-contained. I tend to struggle when it comes to interpreting tables, but by looking at this table before reading an explanation, I could understand what Gilmore is trying to show the reader. He supplies us with a range of library sizes in volumes and then the number and percent of libraries that fell within each size range. While in some of the other tables the use of both number and percent of libraries wasn’t completely necessary, and was fairly confusing, the inclusion of both in this table was very helpful. With both included, the reader gets a sense of how many libraries were made up of a certain number of volumes while getting an idea of the percentage of the total that those libraries represent.

Hopefully my explanation of this table makes sense. If not, it just goes to show how difficult it is to explain some data in words and how vital it is for an author to include tables and graphs in his or her writing. That’s why it was helpful that Gilmore included this table in his chapter rather than just explaining it. While he does give an overall explanation on the page prior to 276, it’s rather difficult to explain, which is why it’s useful to include this data, as it gives a more encompassing representation of his explanation.

Although a graphical representation of this data isn’t really necessary, as the table is pretty easy to understand and visualize, the best way to represent it graphically would be with a pie chart. In doing so, the reader could actually see the differences in library sizes. In the case of library size, it would make sense to represent the data with a pie chart rather than any other graph because it would give the reader a more visual representation of how much a certain size library makes up out of the total libraries. For example, we could see from a pie chart that most libraries have 2-3 volumes as it makes up the largest slice of the pie chart. While this can be seen in the table as well, a pie chart would give a better visual representation of the data, which is the only reason this graphical representation may be better than the table. The only downside would be that the number of libraries would not be numerically represented. While the percentage is basically representing that number, it was interesting to see the number of libraries that had a certain number of volumes. Because of the ease in interpreting this data table in comparison to the others, the only benefit to supplying a graph would be so the reader could gain a more visual understanding of the data through a pie chart.

The task of evaluating and recommending how the data could be graphically represented was very helpful when thinking about the upcoming paper. I didn’t quite realize how important it is to select the best, most helpful representation of data, no matter if it is in a table or chart. It will definitely make me think harder when it comes to representing my data in my paper. I’ll have to evaluate and analyze my own data to determine if I’m including too much, too little, or just enough and if the way I represent it is the best way to do so. I could see the difficulty in understanding some of the other tables in Gilmore’s chapter, and how that hindered my understanding, so I am definitely feeling more confident when it comes to creating my own table and graphical representations of the data, because I see how important it actually is.

Table Analysis and Evaluation

In the article “Reading Becomes a Necessity of Life,” author William Gilmore uses both tables and graphs to effectively represent data in his text. On page 273, Gilmore uses a table to express the ‘Size of Windsor District Family Libraries’. However, I think he would have found more success in using a graph.

Image

Although the table is straightforward after a second or third read, it feels a bit cluttered. The table is divided in a number of ways. First, the table looks at the number of books present in libraries for three different year ranges: 1787-1800, 1801-15, and 1816-30. The library holdings are divided into clusters, ranging from 1 to 15+, and are given in both percentages and whole number form. Under all of this data there are also cumulative percentages based upon the same time frames. Again, these library holdings are divided into clusters, although here they are only listed by percentage. I feel like the complexity (and frankly lack of clarity) in this explanation of the table demonstrates its cumbersome nature. I’m having a hard time writing about it, because there is a lot of data and it is densely organized. There is a lot going on, and although I understand its material now, I was initially very confused about what the table was covering.

There are two things that confuse me most about this table. The first is that there are so many things being evaluated in this single table. It lacks focus, looking at number of volumes, percentages, cumulative percentages, three timeframes, and many other divisions, all in a table that does not even cover half a page of text. I think that the table would have been stronger if divided into two tables: one that focused on the ‘Number of Volumes in Library,’ and another that looked at the ‘Cumulative Percentages’ of the libraries. My second major issue with the table is that the data it represents is not uniform. For the ‘Number of Volumes in Library’ section the conclusions are given in both No. and %. However, these two numbers do not match up exactly because the percentages are not always out of 100%, which can skew the data. Although the data might be consistent in terms of the table itself, I found the data hard to compare. And for the ‘Cumulative Percentages’ section, the conclusions are only listed as percentages, and these percentages are not clear at first read. The divisions compound on each other. I think that it is neat to be able to look at the table and see that between 1787-1800 43% of libraries had Over 3 Volumes, but again I don’t find these numbers particularly comparable. I think that the biggest flaw here is that too many types of data are being calculated in the table.
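One possible reason a percentage column fails to total 100 is rounding, and that is easy to demonstrate. The sketch below uses made-up counts (not Gilmore’s figures, which I don’t reproduce here) to show rounded percentages that fall short of 100, and to show how “over N volumes” cumulative percentages are built by adding up every larger category, which may be why those rows overlap rather than standing alone. The bucket labels and the function name are hypothetical.

```python
# Hypothetical counts per library-size bucket; NOT taken from Gilmore's table.
counts = {"1": 1, "2-3": 1, "4+": 1}

total = sum(counts.values())
# Round each bucket's share to a whole percent, as a printed table would.
rounded = {size: round(100 * n / total) for size, n in counts.items()}
rounded_sum = sum(rounded.values())


def cumulative_over(counts):
    """Percent of libraries in each bucket or any larger one.

    Assumes the dict is ordered from the smallest bucket to the largest.
    """
    total = sum(counts.values())
    result = {}
    running = 0
    for size in reversed(list(counts)):
        running += counts[size]
        result[size] = 100 * running / total
    return result


print(rounded, "sum =", rounded_sum)
print(cumulative_over(counts))
```

Because each cumulative row already contains all of the rows above it in size, comparing those rows as separate bars would double-count libraries, which fits my sense that a bar graph of that section would be misleading.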

Wow, it’s really easy to critique someone else’s table, but I’m sure that I won’t feel the same way when I have to turn my own into a graph. However, I do think that this table could be a lot stronger if graphically represented. First, I would start by breaking the table into two sections. I would leave the ‘Cumulative Percentages’ information as a table because I can’t really imagine how it could be represented graphically. At first I was thinking of a bar graph, but I think that with how the data is separated the graph would be misleading. The divisions are so dependent on each other that I think comparing them as bars would be strange. But I would make the ‘Number of Volumes in Library’ section into a bar graph. I think that it would work best if each year range was its own color bar, and the horizontal axis was separated by the number divisions given in the table. I would also get rid of the percentage representation, because again I think that these values can be misleading because they are not always out of 100%. I would stick with the whole number representations, because I think that these numbers lend themselves well to a bar graph. Ideally, this table would be turned into a graph much like the one we looked at toward the end of last class. Together, I think that this more simplified table and the new graph would explain the information in the original table more effectively.

Talking about tables and graphs was a lot harder than I had anticipated. I was genuinely confused on how to convert some of what I was seeing into words, so sorry if my post was a bit scattered. That said, I am excited to have my own try at making a graph for paper 2. I guess we will see how it goes.