Doing Digital Humanities

When I initially proposed my idea for what would become “Designs by Mannix,” I felt vastly underqualified to do it, but I knew Gordon’s drawings would probably remain unseen otherwise. I was not an artist, an art student, or an art history major. How would I know what constituted an A-line skirt? Could I describe Gordon’s drawings the way that they deserved to be described? In fact, the descriptions turned out to be one of the easier parts.

As I worked on the project, I struggled with what kind of information I should add beyond the drawings themselves: should I include thoughts that some of these female designs appeared to resemble Katharine Hepburn? Should I point out which ones were my favorite? Should I shamelessly ask for someone to commission a physical design of these styles? I tried to remain neutral, but when I received feedback, I discovered that my opinion had to matter, since Gordon’s–apart from the notes he made on some of his drawings–was lost. I had to explain what these drawings meant to me, to give others a reason why I cared so much for them. I also endeavored to study the drawings more, as I was missing essential details like dates.

In the end, it was a lot of tedious work: lots of filling out forms in Dublin Core in Omeka, lots of zooming in on Gordon’s handwriting to determine whether that was an “r” or an “n.” I regretted that I couldn’t take better photos with measurements and lighting without shadows, but I can just be grateful that I had brought my nice camera to take these photos initially. Admittedly, I left out some wonderful works of art because the writing was indecipherable or I couldn’t fathom how to describe it. Time constraints also prevented me from uploading every piece of artwork of which I had a photo. However, this project doesn’t have to be over forever. I can always add to it!

Just as when I first saw Gordon’s art, I was drawn to Gordon’s fashion sketches. Rereading his biography, one of the reasons why I loved them so much became apparent: they featured styles from all around the world. Fashion was a positive outcome Gordon envisioned for the war. He wanted a postwar world in which we all borrowed from the best of one another’s cultures to create multicultural pieces of art. Even 70+ years later, that message isn’t exactly embraced by a lot of the United States population, and I marvel at Gordon’s ability to see the best in diversity, even when he lived in a world torn apart by war.

I wanted to share Gordon’s talent, and I hope I did. However, it appears that we can learn from Gordon. His fashion sketches were not just art: they were an expression of what he wanted the world to be like. I hope one day we can get to that world that Gordon predicted.

Social Media Strategy

Social media! This should be a cakewalk, right? After all, I am not only a Millennial (I cringe at the use of that word, I know) but also a former social media intern (yes, it’s a real job, Mom!) for two organizations. But now I have to craft my very own social media strategy and identify target audiences for my project. Suddenly this isn’t as fun as simply sharing my favorite Mental Floss article. (In case you’re wondering, it’s this one.)

My project targets three groups: residents of my hometown, art lovers, and World War II history students.

I’ve probably explained it before, but I intend my project to be a gallery of photographs of artwork created by Gordon C. Mannix, a soldier from my hometown of Plainville, Connecticut, who died during the Battle for Normandy in 1944. Gordon was originally written about by a member of the 2013 “Price of Freedom” class, so I was unable to write a biography for him. Because my high school has an award named after him, I still wanted to learn more about him, so I’ve been in touch with his niece. I knew from Gordon’s biography that he was a very talented artist who was supposed to attend Parsons School of Design on a full scholarship–until he was drafted into the Army by Uncle Sam. Besides feeling devastated that Gordon’s life ended when he was 19–younger than I was when I took this class in 2016–I was deeply saddened by the idea that his artwork would never be seen, never be in a gallery. When I met with Gordon’s niece last year, she allowed me to see his beautiful sketches, and I knew that Gordon was even more talented than the complimentary descriptions.

The idea of showcasing his artwork germinated in my mind, but the art was old and drawn by someone who never had the chance to become famous. After going over potential ideas for final projects for this class, I decided that this project was the perfect chance to give Gordon’s work a chance to be seen by a larger audience. With his niece’s permission, I have taken photographs of his drawings. While they are not of the best quality, I do have documentation of almost all of his work, which provides multiple perspectives on Gordon’s interests.

I am the kind of person who would spam everyone’s inbox with a link to this online gallery, but I need to use a strategy that doesn’t get me blocked by everybody. Let me break it down group by group.

Group 1: Citizens of Plainville, CT

Since my hometown of Plainville is what connects Gordon and me, I think its residents are an appropriate target audience. While nobody in Washington, DC, knows what or where Plainville is, it’s not small enough for me to go door-to-door and tell everyone about this project. (Besides, that would kind of go against the spirit of the project, right?) Fortunately, that means it’s sizable enough that I might be able to get more traffic than expected, even if I just pull in 10% of the town. (I’m being very optimistic, but oh well.)

Group 2: Art Lovers

Very broad, but since Gordon is an artist, and since most of my descriptions will focus on what’s in the drawings/watercolors/comics, I’m sure there will be an audience out there that is just focused on the aesthetics.

Group 3: World War II Historians/Enthusiasts

Though Gordon and I both lived in Plainville and went to the same high school, I discovered his story because of my class on the Battle for Normandy. Other students in my class would be interested in such a gallery, but I dare to dream a bit bigger and maybe get people who are just interested in World War II history in general, specifically on providing biographical details for the names in every WWII ABMC cemetery.

To reach these audiences, I plan to create new Facebook, Twitter, and Instagram pages. I don’t think a blog is necessary in this medium: the gallery serves the purpose of what a blog would. I could feature some of his art on this blog here, but I feel like that would do Gordon a disservice, to have his beautiful drawings next to my terrible puns and rambling thoughts. So no blog. No Pinterest, because I’m not selling his artwork–that would just be wrong.

Facebook would give me the most reach. It is also the easiest way to target people that I know personally who might be inclined to share this page with people who belong to my target audiences (as well as others!). Facebook has the most users, and of the three platforms I plan to use, it allows for the longest posts, allowing me to go into detail explaining his art.

Twitter may have the character limit, but it allows me to reach organizations that I would have no hope of reaching otherwise. Our local newspaper, The Plainville Citizen, is more likely to see a Tweet mentioning them than if I tagged them in a Facebook or Instagram post. It also has the farthest reach potential. Since most Facebook users have some privacy settings, the posts could just end up bouncing around the Plainville network, not reaching any other groups. On Twitter, things can go far because of the power of the Retweet. It is also easy to accumulate followers on Twitter, since users are quick to follow you back on this platform more so than Facebook.

Instagram is the most specific platform for this project. A platform that is photo-based seems to be a natural platform to showcase artwork. The only downside to Instagram is that links within posts aren’t available (yet), so users would have to go to the profile page and click the link in the bio, which can be a lot of steps for a person on the internet.

Since a lot of Gordon’s art consists of fashion sketches, I’m thinking–especially on Instagram–of using the hashtag #OOTD as the template for my messages. I can describe the outfits that Gordon has designed, much as an Instagram celebrity touts what she is wearing that day, from hat to shoes to everything in between. Twitter will probably follow in the same vein, but messages cannot go into detail. They’ll contain only the bare bones of an outfit: hat, jacket, pants, heels. It’s better than having me list out the color and style of everything: no one wants to read a 7-part Tweet if I’m not a celebrity. Remember: Twitter is more about reaching out to people than about curating exceptional content. On Facebook, I’ll pitch the message of remembrance: this is how we can remember Gordon. Gordon was more than a mortarman; these drawings more accurately display who he was, what he was interested in. The end goal of all these messages is one thing: share, share, share! I’m not doing this to boost my SEO; I want Gordon’s drawings to be seen, to be appreciated. They have always been appreciated by friends and family, but I want to show new audiences his work, since that’s what he deserves.

I can’t possibly run these social media accounts forever; interest dies down eventually, especially when there is a limited amount of content to post. I think doing the bulk of the posting for a month after the gallery’s launch date will be the best time to draw people’s attention. As much as I’d like everyone to see these drawings, I don’t want to crash my website, and I’d also not like to set my expectations too high, as I know internet apathy can be just as strong as internet rage. If I can get to the point of having 50 followers on each platform, I would be very happy. (I have no idea if I’m setting these goals too low or too high, Dr. Robertson!) I don’t know if I can make these profiles into business accounts without, you know, being a business. Otherwise, I could use fancy analytics to measure my reach, but I think it’s best to stick with standard accounts and measure reach through followers. Another good indicator is if each post is shared at least twice. (I say twice since I know my mom will probably share it, and I feel like saying “once” would be cheating.)

I’m trying to combine what I’ve read recently with what I’ve learned in the past year and a half, but I hope this social media strategy works, as I think the goal of getting as many people as possible to see Gordon’s artwork is worth sending out a Tweet, posting a photo, or pasting a link–and much more.

What Can You Do With Crowdsourced Digitization?

At first, crowdsourcing seems like handing off tasks to people and not paying them, which doesn’t sound great at all. However, crowdsourcing is best described as an “it takes a village” effort. These tasks would be monotonous for an hourly worker, and mistakes are bound to pop up when the same person is doing the same thing every day. Add hundreds of others and suddenly you have proofreaders and fact-checkers, all doing this because it involves minimal effort. Of course, some users love to get into the weeds with various crowdsourcing projects–all for digital badges or silly stickers–but crowdsourcing works best when it’s portrayed as a quick activity.

The easiest crowdsourcing activity is corrections. I tried this out with NYPL’s Building Inspector, which involves a user checking building shapes or fixing them himself/herself. I like this project because you’re double-checking a computer, which is less daunting than some other crowdsourcing projects. The stakes aren’t high, and you can do a lot in a small amount of time. The motto “Kill Time; Make History” makes that point best. Now, this crowdsourcing is map- and history-focused, but if you want to contribute to modern-day maps, Google Maps also uses crowdsourcing to provide phone numbers for businesses, to add new locations when a new store opens up, to show you pictures of what a building or park looks like–all because circumspect (read: picky) people provide this information.

Transcriptions involve (much) more work. I’m not going to lie: transcriptions involve time and effort. A lot of their tutorials don’t involve 1920s ragtime piano music and cheeky text: they’re full of rules, guidelines, instructions, etc. I reviewed Papers of the War Department, admittedly because I took a class with Professor Hamner this semester. (Ask me about civil-military relations sometime!) I also love the idea of recovering information that was thought to be lost, as well as the idea of this project being open to the public. I knew the interface wouldn’t be as fun as Building Inspector, but I did expect to be able to zoom in on the letter on the same webpage as the transcription text box. However, zooming in on the letter involved opening the image in another tab and zooming in with the web browser’s zoom tool rather than one built into the interface. I hate switching back and forth between tabs when transcribing, so that piece irked me. There also isn’t instant gratification with transcription like with crowdsourced corrections: 18th-century cursive script is very hard to decipher, and the spelling of even erudite men was non-standardized and absolutely appalling to any English minor. However, that is not to say that I don’t like transcribing. While I need to practice reading 18th-century handwriting, I really enjoy transcribing videos on YouTube. Several YouTube communities invite viewers to transcribe their videos and upload captions. Transcription even goes a step further with communities who also translate videos. This past summer, I transcribed and uploaded captions for several videos for the American Veterans Center’s YouTube channel. This allows people who are hard-of-hearing (which, let’s face it, is the channel’s main demographic) to understand interviews or narration, and it also allows me to improve the metadata, if there is an interesting subject I missed and need to tag.
The AVC YouTube videos could benefit from crowdsourcing–I certainly wasn’t going to caption over a thousand videos, some of which are an hour long. So if you want to release your inner court stenographer, audio transcription could be the way to go.

I enjoy crowdsourcing, but I do fear that it contributes to the “gig economy.” If someone is doing a significant amount of work, I think that they should be paid. At the very least, they should be named as a contributor, especially if the work is published. Crowdsourcing does make “ownership” a bit blurry, and if the project creators and managers aren’t circumspect, crowdsourcing can do more harm than good if someone has to go through all the mistakes and fix them all by hand. All in all, crowdsourcing needs to be monitored; things usually will not run smoothly on their own.

How to Read a Wikipedia Article

Of course, the answer to “How to Read a Wikipedia Article” seems fairly obvious: just read it! However, as I discovered when I came across an article that described Rupert Grint from the Harry Potter movies as “totally going out with Emma Watson in real life,” the information on Wikipedia isn’t always very accurate. Fortunately, there is some accountability, and it’s easy to basically CTRL+Z a page when someone goes rogue on it, which is why you don’t often see profanity-laden articles about digital humanities. If you click “View history” in the upper-righthand corner of the page, you can suddenly see a log for all the changes made to a page. Time, date, and contributor are all listed, as well as the significance of the change. Was the contributor fixing a typo? Is the contributor a general Wikipedia editor who religiously monitors the pages, a person who knows a lot about the page he/she is editing (like the majority of the editors of the “Digital humanities” page), or is it a bot or a random person looking to stir up trouble (usernames like Cheryl27 seem an easy target here)? This log isn’t very intuitive: the information listed first is the most recent. If you want to see when the page was created, you have to click on the “oldest” link above the list. (That’s how I found out that the “Digital humanities” page was, in fact, created by a digital humanist, Elijah Meeks.) If you’re more of a visual person, check out the “Revision history statistics” to see tables and pie charts, as the bland list isn’t very compelling or easy to understand. If you like comparisons, you can compare how the page looked yesterday with how it looked a few months ago, and see where the changes are. (Unfortunately, you can’t really compare a page today with how the page was when it was created–at least as far as I can see.)
When you’re done trawling through edit history, it’s worth checking out the “Talk” link next to the “Article” link on the upper-lefthand corner of the article page. This page documents the controversies (and there can be surprisingly many), from misquotes to arguments over what quotations mean. You have editors justifying their actions (including Elijah Meeks’s “meh” approach to creating the “Digital humanities” page). People can submit ideas for page changes here, as merely changing a page yourself tends to lead to wrath from the misogynistic Wikipedia editors. (Internet editors hate when newbies encroach on their turf, and this bullying has been well-documented by reporters and insiders.) I see the Talk page as a comments or reviews section underneath an online newspaper article or product advertisement. The discourse does tend to take itself too seriously, but it serves as a warning before someone reads an article and blindly takes in the information. Of course, even I wouldn’t look at the Talk page every single time I checked out a Wikipedia page–especially when I’m just checking for plot summaries of musicals, since that’s what I do in my spare time–but for academic perusals, you definitely should. Citing Wikipedia is always a dangerous game; the librarians at my high school would remind us of this by telling the story of a student who wrote a paper on the faked moon landing based on faulty internet sources. However, the “References” and “Further reading” can lead you to primary and secondary sources that are scientifically or historically robust. For the “Digital humanities” page, the list of institutions can show you where to find well-known scholars and projects. In general, Wikipedia is good for summaries and overviews–when you have no idea what game theory is and only have the patience to read one sentence about it. But as with anything free and just lying around, be sure to figure out where it came from. 🙂

Comparing Tools

Although I admittedly have a preference for Palladio, I will not let my bias paint Voyant and CartoDB as terrible pieces of software. In fact, I think all of these tools have their own strengths, are good in their own ways–and I’m not just saying that to avoid making a hard decision. Although all three are meant to highlight information not easily seen or understood within metadata or online archives, they all do completely different things. Voyant analyzes a body of text. It can compare one body of text to the entire corpus, but one has to draw his/her own conclusions from the graphs. I am a sucker for the n-gram search, as I think of it as an academic version of “Google Autofill,” but I digress. Voyant is good for words, words, words. CartoDB, just as its name suggests, is good for maps. If there are a lot of locations in the metadata, it would be a good idea to utilize CartoDB or some other mapping software. CartoDB shows you where things/people are or were. Therefore, it lends itself well to using pictures, which Voyant clearly does not. However, just as without words Voyant is useless, CartoDB is useless without location. I enjoy CartoDB, particularly its animation map, but it is only useful in certain contexts. So I’m going to sound like a real estate agent and say CartoDB is all about location, location, location. Finally, my favorite child, Palladio–what more can I say about this? I love finding hidden relationships between sets, even more so when the sets don’t involve comparing real and rational numbers. Palladio solved my biggest pet peeve with CartoDB, which is that CartoDB–unless you put in extra layers and such–doesn’t show how one set relates to another. I really wanted to see which slaves came to move to Alabama from outside the state, but CartoDB couldn’t show me that: it could only show me dots–differently colored dots, but still.
Palladio showed me the links, and it also made the nodes different sizes so I could see if there was a big influx of former slaves to a certain Alabama city. I could continue in my cheesy repetition and say that Palladio is all about relationships, (relationships, relationships!) but I see it more as a bridge between what Voyant specializes in and what CartoDB specializes in. This is apparent in the Mapping the Republic of Letters project: it bridges words and location, showing where letters were written and what those letters were about. Bringing all of those things together really makes my heart melt.

However, these tools aren’t meant to compete: they’re better at complementing one another. Take biographies about fallen soldiers buried at Normandy American Cemetery (this is somewhat of a recurring motif throughout this blog): I could post all the biographies online and use Voyant to analyze which phrases appear the most. If all these soldiers were buried at Colleville-sur-Mer, we could guess that words like “France” would appear often. However, Voyant could reveal that, say, a lot of them played baseball, based on the number of times “baseball” appears in all the reports. Voyant also makes it clear which biographies contain the most information, as it gives the simple word count. Then we could see which biographies could have more gaps than others.
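For anyone curious what that kind of counting looks like outside of Voyant, here is a minimal Python sketch. I am not claiming this is how Voyant works internally; the three “biographies” are invented placeholder sentences, and the function names (tokenize, corpus_frequencies, word_counts) are hypothetical names I made up for illustration.

```python
import re
from collections import Counter

# Invented placeholder texts; real biographies would be loaded from files.
biographies = {
    "soldier_a": "He played baseball in Connecticut before the war.",
    "soldier_b": "He loved baseball and died in France.",
    "soldier_c": "He died in France.",
}

def tokenize(text):
    """Lowercase the text and return its words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def corpus_frequencies(docs):
    """Count how often each word appears across every document in the corpus."""
    counts = Counter()
    for text in docs.values():
        counts.update(tokenize(text))
    return counts

def word_counts(docs):
    """Return the simple word count of each document, keyed by document name."""
    return {name: len(tokenize(text)) for name, text in docs.items()}

frequencies = corpus_frequencies(biographies)
lengths = word_counts(biographies)
print(frequencies["baseball"], lengths)
```

The shortest entries in the word counts would be the first place to look for biographies with gaps.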

We could use CartoDB in this project for several things. If we want to focus on where all the soldiers came from, we just create points on a dot map, each point representing where each soldier was born. If we have the information about where each soldier died, we could map where they all died in France. It provides something more visual, which Voyant, even with the frequency charts, isn’t great at showing.

Lastly, we could use Palladio to bridge together some of the missing information. We could see lines connecting each soldier from where they were born to where they died. We could graph the relationship between what branch of the military they were a part of and where they died–perhaps indicating the different missions that airmen and soldiers had. We could even see if any of the soldiers could’ve crossed paths, based on the places mentioned in the biographies. Suddenly, new relationships are revealed through networking.

In short, a project can use all three of these tools, all without one overpowering the other. They each have a specialization, so why not use all of them?

Network Analysis with Palladio

I’ve always been intrigued by mapping and networks, given my background in applied mathematics. It’s refreshing to see, instead of mapping sets of real numbers to prime numbers (cue the groaning of all the non-STEM readers), this information being put to more practical use. (I hope my Axiomatic Set Theory professor doesn’t get annoyed I said that!) I really liked using Palladio to create a network, as I find networks inherently more interesting than plain maps. Maps are great for one-dimensional representations–to show relationships, however, I think it’s necessary to use networks.

We used the same data we’ve used for the past few exercises, which is metadata from interviews conducted by government employees with former slaves from 1936-1937. I found this exercise highlighted the most interesting revelations with the data. With mapping, I said that it mostly eliminates grunt work, but it doesn’t really reveal anything new. I guess the same could be true for network analysis, but I don’t think so. Human beings are terrible at grasping relationships between two sets. It’s why we need phrases like “correlation does not equal causation” but also why people can’t even believe correlation might mean something. (Cough, data relating to climate change leading to bad things.) Sifting through metadata may allow me to grasp a basic understanding of how big the Roman Empire was, but I can’t grasp that same understanding with the Republic of Letters, no matter how many times or how long I could stare at it. Therefore, the network analysis allowed me to see things I had never seen before, even with the text mining and the mapping.

We used Palladio for this exercise, which I found to be easy and intuitive. Although many of the projects we reviewed used Gephi, we used Palladio because Dr. Robertson instructed us to. Easy enough decision, then! The setup, I’ll admit, was a bit confusing, as it involved dragging and dropping .csv files into the Palladio webpage. Then I had no idea what everything meant, but all that confusion was easily translated into networks when I hit the “graph” button. I have no idea how it translated all the metadata into a network–and I’m sure I’d have to get a degree in computer science to grasp it fully–but it translated the data beautifully, and it allowed me to choose relationships that I wanted to highlight. Just as with comparing any two sets of data, some were more useful than others. When both the source data and the target data contained large quantities, the network was too large to make any sense of. When one group was smaller, limited to say two or three categories, the relationships were plain to see. And relationships varied greatly. You could make a graph to show the relationship between where a particular former slave had been enslaved and where that same former slave was interviewed–effectively showing you where he/she was and where he/she is (at least when the interview was conducted). So that was a relationship across time, for the same person. But you could also graph relationships between people, for instance, between the interviewers and the interviewees. Which people interviewed former slaves the most? Were any former slaves interviewed twice? Suddenly these questions had easy-to-see answers. Then, to really get into what the former slaves talked about, you could graph what males talked about vs. what females talked about. However, just as Dr. Weingart said, just because you can network a relationship doesn’t mean you should.
Some graphs were more visually informative than others, and others were just complete messes to both academics and laypeople. The relationships that best benefited from network analysis didn’t involve too many categories–even sorting what slaves talked about by age was a little too much. One-to-one is usually the best, but that doesn’t mean overlap isn’t important–in fact, overlap shows where common interests/places/people exist.
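I still don’t know what Palladio does under the hood, but the basic idea of a source-and-target network can be sketched in a few lines of Python. Everything here is an assumption for illustration: the rows are invented stand-ins for the interview metadata, the field names (enslaved_in, interviewed_in) are hypothetical, and sizing a node by the number of links touching it is just one simple choice, not necessarily Palladio’s.

```python
from collections import Counter

# Invented rows standing in for the interview metadata; field names are hypothetical.
records = [
    {"interviewee": "Person A", "enslaved_in": "Mississippi", "interviewed_in": "Mobile"},
    {"interviewee": "Person B", "enslaved_in": "Alabama", "interviewed_in": "Mobile"},
    {"interviewee": "Person C", "enslaved_in": "Mississippi", "interviewed_in": "Mobile"},
    {"interviewee": "Person D", "enslaved_in": "Louisiana", "interviewed_in": "Birmingham"},
]

def build_edges(rows, source_field, target_field):
    """Count how many rows link each (source, target) pair."""
    return Counter((row[source_field], row[target_field]) for row in rows)

def node_sizes(edges):
    """Size each node by the total weight of the edges that touch it."""
    sizes = Counter()
    for (source, target), weight in edges.items():
        sizes[source] += weight
        sizes[target] += weight
    return sizes

edges = build_edges(records, "enslaved_in", "interviewed_in")
sizes = node_sizes(edges)
print(edges.most_common())
print(sizes.most_common())
```

Swapping in different field names for the source and target is the code equivalent of choosing which relationship to highlight, and it shows why two fields with many distinct values each produce a tangle: every distinct pair becomes its own edge.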

In conclusion, perhaps I loved this exercise just because I like networking, but even all the love of networking and graphing couldn’t make up for terrible software. I really liked Palladio, and I would gladly use it again.

Mapping with CartoDB

I’ve toyed with the idea of mapping before, but I thought mapping would involve lots of coding and headaches. Then I used CartoDB, and my life changed! (Sorry, this sounds like an infomercial.) Seriously, though, CartoDB is much, much more user-friendly than I expected. Although I had to follow instructions to create my multiple maps in CartoDB, I marveled at how easy it all was. I have to offer the caveat that I used a data set already provided to me, so I wasn’t manually entering any data laboriously. However, we’re talking about mapping, NOT metadata (thank goodness!), so let me begin.

The first type of map–the “dot map”–is fairly basic. There’s not much difference between this map and drawing dots on a physical map. However, you do get to select the background map, making it look slick and chic or going with an old-timey look. The background can convey a lot about what kind of data is being presented, so I liked this aspect of CartoDB. The other difference that makes this superior to a regular map and pen is the “hover” or “click” pop-ups, which, as the names suggest, allow information connected to each dot to appear when you hover over or click on it with your cursor. I enjoyed the pop-ups because, when graphing in math class, I felt like legends were inadequate to convey all the information I wanted to. This solved that kind of problem perfectly.

The second type of map I tested out was the “animated map.” In this map, dots appeared and disappeared on the map to indicate when each interview was conducted. I really enjoyed this map, even if its information was very limited (i.e., the dates of the interviews). However, if this is the information you wanted, then you wouldn’t have to hover over dozens of dots to find out every date. Also, the animation provided a great insight into the frequency of the interviews across a period of almost a year, especially the months when the most interviews were conducted.

The third map I created was the “heat map,” which I definitely liked the least. I can imagine this kind of map would be useful if there are multiple data points in one town, say, as the dot map tends to let one data point cover up the others. However, since the interviews were relatively widespread, I don’t think this map was the best choice. I can see its uses, however, so don’t write it off. As always, you have to think about the information you want to convey, and the best medium for doing so.

We called the fourth map a “category map.” It’s a bit harder to describe this one, but basically this was a more interactive dot map. I was able to select widgets that allowed you to see the map through different categories: gender of the slave, name of the slave, where the slave was born, etc. True, this was all information available with the pop-ups on the dot map, but I enjoyed the idea of the visual presentation of certain information. Once again, it’s about accessibility of the information. It’s far easier to see dots of two colors and think, “Oh, there are the males and there are the females,” rather than clicking on every single dot. Plus, since I could only select a few widgets, I could focus the dots on the slaves, rather than on the interviewers. The animated map focused on when the interviews took place, which places emphasis on the project to document the slaves, not actually on the slaves themselves.

I suppose with the last map, the “layered map,” I was supposed to utilize all the information I previously learned about creating these maps. However, I couldn’t really use my two favorite types of map–animated and category–so I went the uncreative route and used two dot maps in different colors. The point of the layered map is to showcase multiple data sets on one map, so this map, for instance, shows where slave interviews were conducted and also where slaves were enslaved. However, I wished there were more of a relation between the two data sets. I did get to see that while the interviews were conducted in Alabama, the places people were enslaved were in Louisiana, Alabama, and Mississippi, to name a few. I suppose this might be a fault of mine for expecting too much of CartoDB and of the data sets (they might not have represented the same people, after all), but then I think that eliminates the spirit of a layered map. I wouldn’t put two unrelated data sets on the same map–that wouldn’t make sense. So I wanted to see more of a connection, and there wasn’t one.

Overall, CartoDB was very user-friendly, and it’s a great way to showcase data.

Text Analysis with Voyant

Voyant is an online tool (although that cool Voyant 2.0 requires a download as well as a Java install) that allows you to see and map the frequency of words in a given document. Thankfully, you can load .txt documents into Voyant, sparing you the need to copy the entire Declaration of Independence in one frustrating scroll. You can also load multiple documents, allowing wonderful compare/contrast exercises between the corpus–all of the works–and the individual documents themselves. The five main tools are Cirrus, Reader, Trends, Summary, and Contexts.

Cirrus is basically a fancy word for “Word Cloud”: it gives you a very visual and colorful representation of which words appear most in a given document. Reader, as its name suggests, allows you to read through a whole document–or a whole corpus, if you’re feeling really ambitious. However, it is most useful for letting you hover over certain words to see how many times they appear in that particular document.

Trends appeals to my math-loving side because it is a graph of how often a selected word (from the Cirrus tool) appears in the corpus. And if you want to get really fancy with Trends, you can see a graph for how often a word appears in just one document, so you can see if there are unusual spikes within one document even if there aren’t any present in the whole corpus.

Summary is also self-explanatory: it provides the numbers for everything. There’s nothing visual about Summary; it’s all about words, words, words–and numbers, I suppose. If you don’t know every document’s length or vocabulary density, the Summary tool will figure it out in no time. However, Summary’s most useful category is Distinctive Words, which shows you which words are unusually characteristic of one document compared with the others–which means you don’t have to trawl through the Cirrus or the Frequency tools to see where the gigantic spikes are.

Finally, the Contexts tool appeals to my writing-loving side, as it shows you all the words surrounding the selected term. For instance, Cirrus or Trends couldn’t tell the difference between the nice Mrs. Burns and a fire that burns the whole town down. You could check through the Reader, but frankly, no one wants to do that. Contexts shows you, “Okay, this use of the word means the town was razed.” It stops you from jumping to crazy conclusions, which I am all for.
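
For the curious, here is a minimal Python sketch of the two ideas at the heart of these tools: counting how often each word appears (what Cirrus and Trends visualize) and pulling out the words around a term (what Contexts does). This is not how Voyant itself is implemented; the function names, the sample sentence, and the three-word window are all my own assumptions for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(text):
    """Count how many times each word appears in the text."""
    return Counter(tokenize(text))


def contexts(text, term, width=3):
    """Return (left, term, right) tuples for every occurrence of term.

    width is the number of words kept on each side (an arbitrary choice).
    """
    tokens = tokenize(text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok == term.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Made-up sample text echoing the Mrs. Burns example above
sample = "Mrs. Burns baked bread. The fire burns the whole town down."
counts = word_counts(sample)
kwic = contexts(sample, "burns")

print(counts["burns"])
for left, term, right in kwic:
    print(f"{left} [{term}] {right}")
```

The count alone treats both uses of “burns” as the same word; only the surrounding words tell the person apart from the fire, which is exactly why Contexts earns its place.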

In conclusion, I find Voyant to be a very useful tool, and one it would be easy to lose oneself in. However, a lot of Voyant isn’t very intuitive to use, especially features like getting not-so-frequent words to appear on the Trends graph, which involves selecting the word in various other places like a complicated game of leapfrog. Also, exporting the information was the bane of my existence for this exercise: exporting a Trends graph for “house” in Virginia didn’t always give me that graph, but sometimes one for the whole corpus, which wasn’t selected in the first place! Oh well. Look at your work before you export, kids.

Why Metadata Matters

  • What features of the digital objects does the metadata describe?
  • What features does it not describe?
  • What questions does the metadata allow you to ask?
  • What questions does it not allow you to ask?

In a different approach to reviewing ARTstor, I will examine how well the database utilizes metadata. ARTstor’s metadata, helpfully, provides the basic facts about a piece: creator, title, and work type being the obvious three. It also provides the date that the piece was created, its measurements, its location, the collection to which it belongs, its source, and its rights usage. All of this information is useful for classifying the piece, yes, but it is most important for the viewer. The one detail it includes that is probably more for ARTstor’s benefit than for the person accessing the page is the ID Number.

Unfortunately, ARTstor’s metadata doesn’t contain “subject,” which I consider to be a useful category. Maybe the subject of a painting is tedious and difficult to document, yes, but it’s one of the best ways to filter results. It’s also disappointing that the metadata only contains information about the analogue image–not the digital. So, for example, I don’t know who scanned it, or even if the image was scanned. If it’s a sculpture, I don’t know who took its photo. I wish that there were information on the strictly digital aspects of these images.

Based on these observations, I conclude that the metadata of ARTstor is descriptive, despite the categories it lacks. The metadata allows us to ask questions about the piece as it exists in the physical world: its creator, its location, its date. However, if I want to get meta (not metadata) about the image and its digital copy, there is no information there. Therefore, the metadata doesn’t allow me to ask questions about the digital image, the one I’m viewing on my screen. I think this distinction says something about what kind of information ARTstor values.
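
To make that distinction concrete, here is a small Python sketch that treats one record as a set of named fields and checks which questions it can and cannot answer. The field names are my own shorthand for the categories I described above, not ARTstor’s actual schema, and the values are placeholders rather than data from a real record.

```python
# Fields I found on an item page (names are my own shorthand, not ARTstor's)
DESCRIBED_FIELDS = [
    "creator", "title", "work_type", "date", "measurements",
    "location", "collection", "source", "rights", "id_number",
]

# Fields I wished were there (hypothetical names)
WISHED_FIELDS = ["subject", "digitized_by", "digitization_method"]


def missing_fields(record, wanted):
    """Return the wanted fields that the record leaves empty or omits."""
    return [field for field in wanted if not record.get(field)]


# A stand-in record: every described field is filled with a placeholder
record = {field: "(placeholder)" for field in DESCRIBED_FIELDS}

gaps = missing_fields(record, DESCRIBED_FIELDS + WISHED_FIELDS)
print(gaps)
```

Every question about the physical piece finds a field; every question about the digital copy comes back as a gap.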

Reviewing ARTstor

ARTstor is an image database. ARTstor’s Digital Collection comes from over 280 collections from the United States, Europe, and Asia. Initially, when it started in the 1990s, the mission was to provide an art-based database similar to JSTOR–so similar, in fact, that the name borders on copyright infringement. The first pieces of art to be digitized included art that was in peril. Now, the collection has been able to focus on expansion instead of just the preservation of paintings in peril. It provides two options: basic search and advanced search. The advanced search comes with a plethora of options. You can search various keywords, and you have the option of adding “OR” or “NOT” in front of keywords. You can narrow the search to a certain date range, geographic area, type, or database (full text, GMU database, etc.). All of the results link to pages with the image along with all its metadata. Unfortunately, I cannot find any information on how these items are made digital–photography? scanning?–so that leaves me in the dark, which is a shame, as I’ve been kicking around some ideas for my final project and could use this info… Anyway, ARTstor contains exportable as well as facsimile images–not digitized from microfilm, however.

If you’d like to look at reviews other than my rambling one, it was rated Best Overall Database in 2013 by Library Journal, and Eunice Schroeder wrote a review for Notes: Quarterly Journal of the Music Library Association.

Unfortunately, ARTstor is only available through subscription or libraries. George Mason University has proxy access, but if you want to check its availability at nearby libraries, look it up on WorldCat.

For information on citing images from ARTstor (i.e., digital images), check out the Purdue Online Writing Lab (OWL).