Defcon Short Story Contest entry by be3n
Chapter 1: Knowledge Distribution and Collection
In a large, only moderately filled lecture hall, Yohan stood before a class of first-year computer science students. The dark rings under his brown eyes and his matted light brown hair highlighted his rumpled, unkempt appearance. He was of average height, with broad shoulders and a bit of a barrel chest. Barely older than his students, he hardly had their attention as he began to speak.
“Infinite monkeys pounding on keyboards will eventually produce Shakespeare! We’ve all heard this. Is that machine learning? Are these nearly infinite transistors your army of monkeys?” he asked. A few students timidly raised their hands before he answered himself. “No, we are not leveraging the power of infinite monkeys here. Here we teach learning systems not just how to learn, but how to teach themselves to learn even better,” he continued. The students started to raise their eyes from their glowing screens to follow Yohan as he slowly drew their attention.
He spoke enthusiastically, his eyes beginning to brighten. “Your model is derived from the learning system’s analysis of the data you feed it. Bad data creates bad learning. One famous example from the early days of machine learning was an Army effort to train a system to detect tanks in aerial reconnaissance photos. The scientists working on the project did not notice at the time that all of the sample photographs containing tanks had been taken on overcast days, while most of the photos without tanks had been taken on sunny days. In the end, they did not train the model to detect tanks, but to detect cloudy weather. If you are not careful, this can happen to you.
“False assumptions lead to false predictions, and the model degrades. It is increasingly important to properly select the data points for your matrix, as well as to allow for the training of the weights and biases assigned to those data points.
“Just like our tank example, we can find other false assumptions. The racist bias behind the idea that immigrants bring crime can also be attributed to a bad learning data set. In our immigrant example, that learning data set could be the content produced by Fox News. Bias is not always bad; we need biases to judge the significance of our data. For another example, say you wanted to go out to eat. To select a location, you might ask a friend, Paul, for his favorite restaurant. He suggests a bowling alley diner. Does this choice reflect bias? Might it help to know, going in, that Paul’s favorite food is hot dogs and that he bowls every weekend as part of a league? Should that affect the significance you mentally assign to his suggestion? These are the weights and biases we apply in our everyday decision making, and these same sorts of weights and biases must be trained into the learning model. And they are, every time; it is usually not possible to eliminate input bias entirely.” Many in the class seemed a bit perplexed by all this new information.
“Rule of Acquisition #74 states that knowledge equals profit. In our case, knowledge equals model. The more training you can supply, the more points in your matrix, the greater and more accurate the predictions. How many of you have worked with confusion matrices?” Someone called out, “They’re all confusion matrices to me,” but Yohan didn’t reply; he just waited patiently. In time nearly all the students raised their hands. That was a good sign. He continued his lecture, going over examples of using the biases of various data to increase the flexibility of the model. By the time he got to practical uses for stochastic gradients, most of his students’ attention had wandered.