There is a widespread misconception that AI will solve all of a business's problems with just a sprinkle of GPT, and boom. That is hardly ever the case. Here I'll lay out a much more reasonable, and more importantly repeatable and predictable, path to success. We'll use a sports betting algorithm for fantasy baseball to illustrate the progression:

Human expertise -> Expert system -> Learning model -> Artificial intelligence
Human Expertise
Start with a human expert. How is the process done today? What are we replacing? Analyze the steps and start teaching the process to another human. Anything can be taught; intuition is just a bad teacher. In this case, the question is: what is the winning strategy for sports bets? Two strategies we will discuss are comparing a batter's average points scored against tonight's salary, and stacking the team. If you divide average points by salary, you get points per dollar spent, which gives a general idea of who's on "sale" tonight. This informs a better choice. Team stacking pays off when grand slams are hit, or when one team blows out the other and scores a lot of runs. This creates many correlated points with similar probability. By analyzing these strategies and then teaching them to an algorithmist, you can devise a code-based expert system.
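As a rough sketch of the first strategy, the points-per-dollar ranking might look like the Python below. The Player fields and sample numbers are assumptions for illustration, not the production scraper.

# Sketch only: field names and sample data are hypothetical.
from dataclasses import dataclass

@dataclass
class Player:
    name: str
    avg_points: float   # average fantasy points per game (scraped)
    salary: int         # tonight's DraftKings salary

def points_per_dollar(p: Player) -> float:
    # Higher means the player is more "on sale" tonight.
    return p.avg_points / p.salary

players = [
    Player("Batter A", 9.1, 5200),
    Player("Batter B", 7.4, 3800),
]
for p in sorted(players, key=points_per_dollar, reverse=True):
    print(f"{p.name}: {points_per_dollar(p) * 1000:.2f} pts per $1000")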
Building an expert system is simply a matter of codifying the rules and strategies above into code. For the first strategy, we get a more accurate view of a player's "average" score by scraping the stats from baseballreference.com instead of DraftKings. That gets us higher quality data, and dividing by salary gets us the cheapness of the player. The computer lets us do this for every player in a few minutes without human intervention, so we've already created a win. Here's the stacking check in Go:
func stacked(players []Player) bool {
    // Count how many rostered players come from each team,
    // skipping lineup slot 28 (not counted toward the stack).
    teamCounts := make(map[string]int)
    for _, player := range players {
        if player.lineupSlot == 28 {
            continue
        }
        teamCounts[player.team]++
    }

    // Require exactly one team with exactly three players, and no team with more.
    oneStack := false
    stackTeam := ""
    for team, tc := range teamCounts {
        if tc > 3 {
            return false
        }
        if tc == 3 {
            if oneStack {
                return false
            }
            oneStack = true
            stackTeam = team
        }
    }
    if !oneStack {
        return false
    }
    _ = stackTeam // the stacked team is tracked but not used further here
    return true
}
This checks that exactly one team has exactly three of our players, and that no team has more than three. Now we have a stacked team and can play our correlated-points strategy.
Expert System
The next step is to find areas to fold into a learning model. This takes a bit of creativity, and the realization that fantasy baseball is a two-player game: a matchup between a pitcher and a batter. Points in the fantasy system are largely independent of what the fielding team does; they come mostly from the skill of the batter as paired against the pitcher. So instead of using an average of the batter points and pitcher points for tonight, we can normalize against the starting pitcher, and maybe one relief pitcher. That gives a more accurate view of the "average" and of tonight's value. The model code looks like this in Python:
# Assumed imports for the snippet below (TensorFlow / Keras and its backend).
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import backend as K


def get_loss_batter(cutoff):
    # RMSE loss for batters (cutoff is currently unused here).
    def my_loss(y_true, y_pred):
        # q = .30
        diff = y_true - y_pred
        return K.sqrt(K.mean(diff * diff, axis=-1))
    return my_loss


def get_loss_pitcher(cutoff):
    # RMSE loss for pitchers (cutoff is currently unused here).
    def my_loss(y_true, y_pred):
        # q = .35
        diff = y_true - y_pred
        return K.sqrt(K.mean(diff * diff, axis=-1))
    return my_loss


def build_model(inputWidth, cutoff, lossFunc):
    # Small fully connected regression network: one hidden layer roughly
    # half the width of the input, then a single output score.
    init1 = keras.initializers.RandomNormal(mean=0.0, stddev=0.12, seed=4)
    init2 = keras.initializers.RandomNormal(mean=0.0, stddev=0.12, seed=1)
    print('width ', inputWidth, max(1, int(round((inputWidth + 1) / 2))))
    model = keras.Sequential([
        layers.Dense(int(round((inputWidth + 1) / 2)), activation=tf.nn.relu,
                     input_shape=[inputWidth], kernel_initializer=init1),
        # layers.Dense(int(round((inputWidth + 1) / 4)) + 2, activation=tf.nn.relu,
        #              kernel_initializer=init2),
        layers.Dense(1)
    ])
    # optimizer = keras.optimizers.Adam()
    # optimizer = keras.optimizers.SGD(nesterov=False)
    optimizer = keras.optimizers.RMSprop()
    model.compile(loss=lossFunc,
                  optimizer=optimizer,
                  metrics=['mean_absolute_error', 'mean_squared_error', lossFunc])
    return model
This uses the machine learning libraries TensorFlow and Keras. With a more accurate view of the average scores, we're now ahead of the average gambler we go up against.
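To make the idea concrete, here is a minimal sketch of the matchup normalization and a training call, assuming the definitions above are in scope. The feature construction, scaling, and data are illustrative placeholders, not the exact production pipeline.

import numpy as np

# Hypothetical normalization: express a batter's scoring relative to tonight's
# starting pitcher rather than as a raw league-wide average.
def matchup_adjusted(batter_avg_pts, pitcher_avg_pts_allowed, league_avg_pts):
    return batter_avg_pts * (pitcher_avg_pts_allowed / league_avg_pts)

print(matchup_adjusted(8.0, 10.0, 9.0))   # a batter facing a generous pitcher is adjusted upward

# Toy training set: each row is a batter's matchup-adjusted feature vector.
train_X = np.random.rand(200, 6).astype('float32')
train_y = np.random.rand(200).astype('float32')

# build_model and get_loss_batter are defined in the block above.
model = build_model(inputWidth=train_X.shape[1], cutoff=0.30,
                    lossFunc=get_loss_batter(0.30))
model.fit(train_X, train_y, epochs=20, validation_split=0.2, verbose=0)
tonight = model.predict(train_X[:5])      # normalized score predictions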
Learning Model
Finally, you have an AI system: it takes the scraped data, puts it into a model, and gleans more than was possible without manipulating and crunching the data. Here's the main driver and regression model for the baseball data:
def driveTrain(which, inputWidth, train_dataset, train_labels, test_dataset, test_labels,
               hiddens, cutoff, weightsfile, toPredict, mean, maxx, minn, scoreScaler,
               lossFunc, save=True, load=True, epochs=2):
    # train(), errorCalcs(), uploadData(), log() and getFolderTraining() are
    # helpers defined elsewhere in the codebase.
    model, test_predictions, history = train(inputWidth, train_dataset, train_labels,
                                             test_dataset, test_labels, weightsfile,
                                             cutoff, lossFunc, epochs=epochs, load=load)
    done = errorCalcs(model, test_predictions, test_labels, test_dataset,
                      cutoff, maxx, minn, scoreScaler)
    if save:
        # Persist the trained weights and upload them for later runs.
        model.save_weights(weightsfile)
        f = open(weightsfile)
        lines = f.read()
        f.close()
        uploadData(lines, 'training/', 'weights.txt')
    if not toPredict.size:
        return None, None
    # Predict tonight's scores and map them back from normalized units to fantasy points.
    predicted = model.predict(toPredict)
    predictedUnNormal = []
    for pp in predicted:
        ans = int(round((pp * (maxx - minn) + mean) * scoreScaler))
        predictedUnNormal.append(ans)
    # Pair each prediction with its player id and upload the results.
    contents = ''
    for i in range(len(predictedUnNormal)):
        contents += str(hiddens[i]) + ',' + str(predictedUnNormal[i]) + '\n'
    contents = contents[:-1]
    log(contents)
    folder = getFolderTraining()
    uploadData(contents, folder, which + 'predictions.csv')
    return test_predictions, history
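The un-normalization step is the piece worth pausing on: the network predicts in normalized units, and driveTrain maps each prediction back to a fantasy-point total. A toy run of that arithmetic, with made-up placeholder values for the scalers:

# Placeholder numbers, for illustration only.
maxx, minn, mean, scoreScaler = 25.0, -2.0, 8.0, 1.0
pp = 0.15   # a raw model output in normalized units
fantasy_points = int(round((pp * (maxx - minn) + mean) * scoreScaler))
print(fantasy_points)   # -> 12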
Artificial Intelligence
The cherry on top is being able to do other tricks to maximize win probability. After constructing the best fantasy team to bet on tonight, we analyze the bettor pool: look at the bettors we're up against in each contest, sign up for each one, check the competition, and drop out of the pools with the strongest competitors. By analyzing the past success of the other gamblers, you pick the weakest field and make the biggest bank.
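A minimal sketch of that contest selection, assuming we've already scraped each opponent's historical win rate; the contest names, data structures, and numbers here are hypothetical.

# Sketch only: opponent win rates would come from scraped contest history.
contests = {
    'contest_a': [0.61, 0.55, 0.48],   # historical win rates of the entrants
    'contest_b': [0.32, 0.41, 0.29],
    'contest_c': [0.50, 0.52, 0.47],
}

def field_strength(win_rates):
    # Average past win rate of the opponents in the contest.
    return sum(win_rates) / len(win_rates)

# Enter the contest with the weakest field; drop out of the rest.
best = min(contests, key=lambda c: field_strength(contests[c]))
print('enter', best)   # -> enter contest_b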
This should shed some light on the proper progression to creating an "AI", or simply an algorithm that uses data. AI gets too much hype, and business owners think they can throw a ton of spaghetti into a computer and it will spit out gold. You need a skilled algorithm developer to make this happen, and the process always looks a little like what's above. It almost always starts with human expertise. Fin.