Obono, Oliver de Bono


Figuring out the various systems and then presenting everything on a website was definitely quite time consuming.

Part 1


First, I used the Ngram Viewer to see whether 'apple' appeared more frequently dependent on 'pie' or on 'cider' in its corpus of American English. I did this with the dependency operator '=>'. I saw that for most of the period between 1800 and 2019, apple cider appeared more frequently in written language than apple pie. There was a large spike in mentions of apple cider just before 1820, but I couldn't learn why. Apple pie briefly overtook apple cider between the 70s and 1990, and then again in the early 2000s until recently.
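As an aside, these queries can be scripted: the Ngram Viewer graph is just a URL with the query in its 'content' parameter, so the '=>' operator only needs URL-encoding. Here is a minimal sketch; the corpus identifier 'en-US-2019' and the 'pie=>apple' query direction are my assumptions, not something confirmed in this write-up.

```python
from urllib.parse import urlencode

def ngram_url(content, corpus="en-US-2019", start=1800, end=2019):
    """Build a Google Ngram Viewer graph URL for a query string.

    The dependency operator '=>' is passed straight through in the
    'content' parameter; urlencode handles the escaping.
    """
    params = {
        "content": content,
        "year_start": start,
        "year_end": end,
        "corpus": corpus,  # assumed id for the American English corpus
        "smoothing": 3,
    }
    return "https://books.google.com/ngrams/graph?" + urlencode(params)

# e.g. comparing 'apple' as a dependent of 'pie' vs of 'cider':
url = ngram_url("pie=>apple,cider=>apple")
```

Pasting the resulting URL into a browser reproduces the comparison chart without typing the query by hand.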



Then I used the wildcard search to see the top ten words that precede 'Football Club' in written American English. The phrase only seems to have entered American usage in the early 20th century. I was interested in the words that appeared before it. Liverpool and Arsenal are both Premier League soccer teams. I support Liverpool, so I was happy to see them on the list, along with their spike in attention around 2016, although I can't imagine why, since we didn't have a particularly good season that year. City and United are both suffixes stuck onto the names of lots of towns' and cities' football teams, for example the two Manchester soccer teams, Man United and Man City. These lines therefore probably aggregate lots of teams. I was also interested to see some NFL team names. As far as I understand, US sports teams are referred to as 'franchises', not clubs; English soccer teams are called clubs because they often started as local teams over 100 years ago. In any case, some people have been writing about these teams and referring to them as clubs.

Part 2


I wrote a research paper on Moby Dick in my senior year of high school so I wanted to use a text I was familiar with.



Here is a cirrus (word cloud) of Moby Dick.



I found the relative frequency graphs quite useful. Queequeg begins making his own coffin in the second half of the novel.
The novel then ends with Ishmael floating on his coffin after the Pequod is sunk.



I really enjoyed the link feature, as you can clearly see which words frequently appear in
connection with other words and how different keywords relate to each other.



I don't think it's very practical but the loom is a cool visualisation.

Part 3



'Like' is a positive +2 word, which is fine when it's a verb, but Sentimood can't read it any other way. So when I wrote "you look like a garbage rat", Sentimood thought it was positive. 'Cancer' carries a -1 in Sentimood. The word 'cancer' describes a horrible disease, and is probably used that way 99% of the time, but it comes from the Latin for crab, which is why there is also a star constellation called Cancer. Tropic of Cancer, a title about as neutral as neutral gets, therefore scores negative. This is a niche complaint, though.

I think 'murder' and 'genocide' are weighted seriously wrong. 'Murder' is only -2 while 'kill' is -3, yet murder is nearly always bad, whereas people argue killing is sometimes justified. Genocide is always really bad, but Sentimood doesn't agree: it says 'genocide' carries no negative connotations at all. Sentimood.
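These failures are what you'd expect from a plain bag-of-words scorer: each known word has a fixed valence, everything else counts zero, and the scores are simply summed, with no sense of part of speech or context. A minimal sketch of that idea, using only the four weights quoted above (a real lexicon like Sentimood's is of course much larger):

```python
import re

# Toy lexicon using only the weights discussed above; any word not
# listed contributes 0, which is how 'genocide' ends up neutral.
LEXICON = {"like": 2, "cancer": -1, "murder": -2, "kill": -3}

def score(text):
    # Lowercase, split into words, and sum the valences.
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

score("you look like a garbage rat")  # -> 2: 'like' scores +2 even as a preposition
score("genocide")                     # -> 0: not in the lexicon at all
```

Because there is no grammar anywhere in this pipeline, a scorer like this has no way to distinguish "I like you" from "you look like a garbage rat".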

The corporate demo and Sentimood agree that these quotes from Twelfth Night and Pharrell Williams's song "Happy" are positive.


However, they disagree about this quote from Cersei in Game of Thrones. Sentimood says it is quite positive.



Meanwhile, the corporate demo says it is negative.



They also disagree on this line by Lysander in A Midsummer Night's Dream.
The corporate demo says it is negative while Sentimood says it is positive.



Here are two examples of the corporate demo and Sentimood agreeing and being wrong. They both think the line about rage from Iago is positive (I disagree). My favourite example is both of them thinking this line from Ramsay Bolton in Game of Thrones is positive. I can't think why. He says it while looking over a prisoner in his dungeon, no less.


Part 4

I used Google Translate and DeepL between Russian and English. I think part of the difficulty in translating between Russian and English for more than their literal meaning is that their grammars are very different.

Here are two examples of both Google Translate and DeepL working. The phrase "manuscripts don't burn" is from the famous Russian novel The Master and Margarita by Mikhail Bulgakov. In the novel there is a manuscript that literally does not burn, but the phrase makes a point about art being irrepressible under Soviet censorship. It translates well in both tools. "Tear down this wall" is from Reagan's speech in front of the Berlin Wall. Both tools returned the same phrase when translated there and back.



In these cases the translation is not so good.
In the first case, Google Translate fails to identify from context that the English homonym "match" here refers to a matchstick and not a sports game. DeepL, however, works out from the context that we're talking about matchsticks, and changed its word for match as I typed the second half of the sentence.
The second one isn't as egregious, but neither Google Translate nor DeepL identifies that "lie" here refers to someone being on the ground rather than speaking untruthfully.


I thought of the match example because that's how someone in my high school Russian class got caught using Google Translate. I think Google is OK for basic stuff in Russian as long as you're a bit skeptical. DeepL seems pretty impressive to me and appears to use context very effectively to work out what the user intends.

Part 5

I did two experiments. I wanted the first one to be easy, so the two classes were me in a red t-shirt and me in a dark navy turtleneck and glasses. The algorithm could tell pretty well which class what I was showing it belonged to. I tried to confuse it by showing some red while wearing the turtleneck and glasses, and it was a bit confused, so I trained it to allow for that; when I tried again afterwards, it put me in class 1 straight away.



For the next one, I wanted the algorithm to distinguish between me wearing my glasses upside down and normally. It found this pretty difficult and would only sometimes, and momentarily, reach 100% certainty. At first more training didn't help; once I moved closer to the camera, though, it could tell.