1. Overview
Welcome to the Realtime on-device in-app purchase optimization codelab. In this codelab you'll learn how to use TensorFlow Lite and Firebase to train and deploy a custom personalization model to your app.
This tutorial shows how to build a machine learning model for personalization, in particular one that predicts the optimal in-app purchase (IAP) offering given the state the current user is in. This is an example of a contextual bandit problem, an important and widely applicable kind of machine learning problem that you'll learn more about in this codelab.
What you'll learn
- Collect analytics data with Firebase Analytics
- Preprocess analytics data using BigQuery
- Train a simple ML model for on-device optimization of in-app purchases (IAPs)
- Deploy TFLite models to Firebase ML and access them from your app
- Measure and experiment with different models via Firebase A/B Testing
- Retrain and deploy new models with the latest data on a recurring cadence
What you'll need
- Android Studio version 3.4+
- A physical test device with Android 2.3+ and Google Play services 9.8 or later, or an emulator with Google Play services 9.8 or later
- If using a physical test device, a connection cable
- Novice-level knowledge of machine learning
2. Problem Statement
Let's say you are a game developer who wants to show personalized in-app purchase (IAP) suggestions at the end of each level. You can only show a limited number of IAP options each time, and you don't know which ones will have the best conversion. Given that each user and each session is different, how do we go about finding the IAP offer that gives the highest expected reward?
3. Get the sample code
Clone the GitHub repository from the command line.
git clone https://github.com/googlecodelabs/firebase-iap-optimization.git
This repo contains:
- A Jupyter notebook (.ipynb) that trains the personalization model and packages it into a TFLite model
- A sample Kotlin app that uses the TFLite model to make predictions on-device
4. Run the app with Firebase
In this codelab, we will work on optimizing the IAPs of our fictional game app, Flappy Sparky. The game is a side-scroller where the player controls Sparky, attempting to fly between columns of walls without hitting them. At the beginning of each level, the user is presented with an IAP offer that will give them a powerup. We'll only be implementing the IAP optimization portion of the app in this codelab.
You will be able to apply what you learn here to your own app that's connected to a Firebase project. Alternatively, you can create a new Firebase project for this codelab. If you need help getting started with Firebase, please see our tutorials on this topic (Android and iOS).
5. Collect analytics events in your app
Analytics events provide insight into user behavior, and are used to train the ML model. For example, the model may learn that users who play for longer are more likely to make an IAP to get extra lives. The ML model needs analytics events as input to learn this information.
Some analytics events we may want to log include:
- How long the user plays the game
- What level the user reaches
- How many coins the user spends
- Which items the user buys
Download sample data (Optional)
In the following steps, we'll use Firebase Analytics to log analytics events to use in our model. If you'd rather skip collecting your own data, jump to the "Train the optimization model" section of this codelab and follow along with our sample data.
Collect Data with Firebase Analytics SDK
We will use Firebase Analytics to help collect these analytics events. The Firebase Analytics SDK automatically captures a number of events and user properties. It also allows you to define your own custom events to measure the events that are unique to your app.
Installing Firebase Analytics SDK
You can get started with Firebase Analytics in your app by following the Get Started with Google Analytics documentation. The firebase-iap-optimization repository cloned at the beginning of this codelab already includes the Firebase Analytics SDK.
Log custom events
After setting up the Firebase Analytics SDK, we can start logging the events we need to train our model.
Before we do that, it's important to set a user ID in the analytics event, so we can associate analytics data for that user with their existing data in the app.
MainActivity.kt
firebaseAnalytics.setUserId("player1")
Next, we can log player events. For IAP optimization, we want to log each IAP offer presented to the user and whether that offer is clicked on by the user. This gives us two analytics events, offer_iap and offer_accepted. We'll also keep track of a unique offer_id so we can later join the two events to see whether an offer was accepted.
MainActivity.kt
predictButton?.setOnClickListener {
  predictionResult = iapOptimizer.predict()

  firebaseAnalytics.logEvent("offer_iap") {
    param("offer_type", predictionResult)
    param("offer_id", sessionId)
  }
}

acceptButton?.setOnClickListener {
  firebaseAnalytics.logEvent("offer_accepted") {
    param("offer_type", predictionResult)
    param("offer_id", sessionId)
  }
}
For more information on logging custom events, visit the Firebase Analytics Log Events documentation.
6. Preprocess data in BigQuery
In the last step, we collected events recording which IAP offer was presented to the user and which offer the user clicked on. In this step, we will combine this event data with user data so our model can learn from a complete picture.
To do this, we will need to start by exporting the analytics events to BigQuery.
Link your Firebase project to BigQuery
To link your Firebase project and its apps to BigQuery:
- Sign in to Firebase.
- Click the settings gear icon, then select Project Settings.
- On the Project Settings page, click the Integrations tab.
- On the BigQuery card, click Link.
(Optional) Export your Firestore collections to BigQuery
In this step, you have the option to export additional user data from Firestore to BigQuery to help train the model. If you'd like to skip this step for now, jump to the "Preparing data in BigQuery" section of this codelab and you can follow along with the Firebase Analytics events logged in the last step.
Firestore may be where you've stored users' signup dates, in-app purchases made, levels in the game, coin balances, or any other attributes that might be useful in training the model.
To export your Firestore collections to BigQuery, you can install the Firestore BigQuery Export Extension. Then, join tables in BigQuery to combine this data with the data from Google Analytics to use in your personalization model and throughout the rest of this codelab.
Preparing data in BigQuery
In the next few steps, we will use BigQuery to transform our raw analytics data into data usable for training our model.
In order for our model to learn which IAP offer to present based on the user and the game state, we need to organize data about the following:
- the user
- the game state
- the offer presented
- whether the presented offer is clicked on or not
All this data will need to be organized into a single row in a table for our model to process it. Luckily, BigQuery is set up to help us do just that.
BigQuery allows creating "views" to keep your queries organized. A view is a virtual table defined by a SQL query. When you create a view, you query it the same way you query a table. Using views, we can first clean up our analytics data.
To see whether each in-app purchase offer was clicked, we need to join the offer_iap and offer_accepted events that we logged in the previous step.
all_offers_joined - BigQuery view
SELECT
  iap_offers.*,
  CASE
    WHEN accepted_offers.accepted IS NULL THEN FALSE
    ELSE TRUE
  END AS is_clicked
FROM
  `iap-optimization.ml_sample.accepted_offers` AS accepted_offers
RIGHT JOIN
  `iap-optimization.ml_sample.iap_offers` AS iap_offers
ON
  accepted_offers.offer_id = iap_offers.offer_id;
all_offers_with_user_data - BigQuery view
SELECT
  offers.is_clicked,
  offers.presented_powerup,
  offers.last_run_end_reason,
  offers.event_timestamp,
  users.*
FROM
  `iap-optimization.ml_sample.all_offers_joined` AS offers
LEFT JOIN
  `iap-optimization.ml_sample.all_users` AS users
ON
  users.user_id = offers.user_id;
Export BigQuery dataset to Google Cloud Storage
Lastly, we export the BigQuery dataset to Google Cloud Storage (GCS) so we can use it in model training.
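If you'd rather script this step, below is a minimal sketch using the google-cloud-bigquery Python client; the table and bucket names are placeholders, not names from this codelab's project. Note that BigQuery can't extract a view directly, so the view is first materialized into a table.

from google.cloud import bigquery

client = bigquery.Client(project="iap-optimization")

# BigQuery can't export a view directly, so materialize it into a table first.
client.query(
    "CREATE OR REPLACE TABLE ml_sample.training_data AS "
    "SELECT * FROM ml_sample.all_offers_with_user_data"
).result()

# Export the table as sharded CSV files to a Cloud Storage bucket you own.
client.extract_table(
    "iap-optimization.ml_sample.training_data",
    "gs://your-bucket/iap-data/offers-*.csv",
).result()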
7. Train the optimization model
Sample data
Use either your data from the previous step, "Preprocess data in BigQuery," or the downloadable sample data provided here to follow along with the rest of this codelab.
Problem definition
Before we start training the model, let's spend some time defining our contextual bandit problem.
Contextual bandits explained
At the beginning of each level in Flappy Sparky, the user is presented with an IAP offer that will give them a powerup. We can only show one IAP option each time, and we don't know which ones will have the best conversion. Given that each user and each session is different, how do we go about finding the IAP offer that gives the highest expected reward?
In this case, let's make the reward 0 if the user doesn't accept the IAP offer, and the IAP value if they do. To maximize the reward, we can use our historical data to train a model that predicts the expected reward for each action given a user, and then find the action with the highest expected reward (a minimal sketch follows the list below).
The following is what we will use in the prediction:
- State: information about the user and their current session
- Action: IAP offers we can choose to show
- Reward: value of the IAP offer
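Concretely, prediction amounts to scoring each candidate action for the current state and picking the best one. Below is a minimal sketch of that idea; predict_reward is a hypothetical callable standing in for the trained model, not code from this codelab's repo.

def choose_offer(predict_reward, state, offers):
    # Score each candidate offer for this state and return the one with the
    # highest expected reward (0 if declined, the IAP value if purchased).
    return max(offers, key=lambda offer: predict_reward(state, offer))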
Exploitation vs Exploration
For all multi-armed bandit problems, the algorithm needs to balance exploration (gathering more data to learn which action gives the optimal result) and exploitation (using the optimal result to obtain the highest reward).
In our version of the problem, we will simplify this to only train the model periodically in the cloud and only do predictions when using the model on the user's device (as opposed to also training on the device). To make sure we have sufficient training data after we start using the model, we'll sometimes need to show randomized results to our app users (e.g. 30% of the time). This strategy of balancing exploration and exploitation is called epsilon-greedy, sketched below.
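Here is a minimal sketch of epsilon-greedy selection, reusing the hypothetical predict_reward callable from the earlier sketch; the 0.3 mirrors the 30% exploration rate mentioned above.

import random

def select_offer(predict_reward, state, offers, epsilon=0.3):
    if random.random() < epsilon:
        # Explore: a random offer keeps producing training data for every arm.
        return random.choice(offers)
    # Exploit: show the offer the model expects to earn the most.
    return max(offers, key=lambda offer: predict_reward(state, offer))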
Training the model
You can use the training script (training.ipynb) provided with the codelab to get started. Our goal is to train a model that predicts the expected reward for each action given a state; we then find the action that gives us the highest expected reward.
Training locally
The easiest way to get started with training your own model is to make a copy of the notebook in the code sample for this codelab.
You don't need a GPU for this codelab, but if you need a more powerful machine to explore your own data and train your own model, you can get an AI Platform Notebook instance to speed up your training.
In the training script provided, we create an iterator that generates training data from the CSV files exported from BigQuery, then use that data to train the model with Keras. Details of how to train the model can be found in the comments of the Python notebook.
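To make the shape of the training step concrete, here is an illustrative Keras setup, not the notebook's exact code: it assumes the input is the encoded state concatenated with a one-hot encoding of the offer shown, and the label is the observed reward (0 if the offer was declined, the IAP value if purchased). The feature sizes and random arrays are stand-ins.

import numpy as np
import tensorflow as tf

STATE_DIM, NUM_OFFERS = 16, 3  # illustrative sizes, not the notebook's

model = tf.keras.Sequential([
    tf.keras.Input(shape=(STATE_DIM + NUM_OFFERS,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),  # predicted expected reward of (state, offer)
])
model.compile(optimizer="adam", loss="mse")

# Toy stand-ins for the CSV iterator described above.
X = np.random.rand(512, STATE_DIM + NUM_OFFERS).astype("float32")
y = np.random.rand(512, 1).astype("float32")
model.fit(X, y, batch_size=64, epochs=3)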
Measure the model performance
While training the model, we will compare it against a baseline agent that selects IAP offers at random, to see whether our model is actually learning. This logic lives in ValidationCallback.
At the end of training, we use the data in test.csv to test our model again. The model has never seen this data before, so we can be confident the result is not due to overfitting. In this case, the model performs 28% better than the random agent.
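As a rough illustration of how such a comparison can work, the hypothetical helper below estimates a policy's average reward from uniformly randomized exploration logs by rejection sampling; the notebook's ValidationCallback implements the real logic, which may differ.

import numpy as np

def estimate_policy_reward(choose, logged):
    # `logged` holds (state, shown_offer, observed_reward) triples from traffic
    # where offers were shown uniformly at random. Keeping only rows where the
    # policy agrees with the logged offer gives an unbiased reward estimate.
    matched = [reward for state, offer, reward in logged if choose(state) == offer]
    return float(np.mean(matched)) if matched else 0.0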
Export the TFLite model
Now we have a trained model ready to use, except it's currently in TensorFlow format. We'll need to export the model to the TFLite format so it can run on mobile devices.
train.ipynb
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with tf.io.gfile.GFile('iap-optimizer.tflite', 'wb') as f:
  f.write(tflite_model)
From here, you can download the model and bundle the model with your app.
Optionally, for a production app, we recommend that you deploy the model to Firebase ML and have Firebase host your model. This is useful for two main reasons:
- We can keep the app install size small and only download the model if needed
- The model can be updated regularly and with a different release cycle than the entire app
To learn how to deploy the model to Firebase ML, you can follow the Add Firebase to your TFLite-powered Android App codelab. You have the option of deploying using the Firebase console or the Python API.
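For reference, here is a hedged sketch of the Python API route using the Firebase Admin SDK (pip install firebase-admin); the bucket name, display name, and file path below are placeholders, not values from this codelab.

import firebase_admin
from firebase_admin import ml

firebase_admin.initialize_app(
    options={"storageBucket": "your-project.appspot.com"})

# Upload the .tflite file to Cloud Storage and register it with Firebase ML.
source = ml.TFLiteGCSModelSource.from_tflite_model_file("iap-optimizer.tflite")
model = ml.Model(
    display_name="iap_optimizer",  # the name your app requests at runtime
    model_format=ml.TFLiteFormat(model_source=source),
)
created = ml.create_model(model)
ml.publish_model(created.model_id)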
8. Making predictions on-device
The next step is to make predictions using the model on-device. You can find an example app that downloads a model from Firebase ML in the app folder of the sample code you downloaded; it uses the model to perform inference with some client-side data.
Because we applied some preprocessing during model training, we will need to apply the same preprocessing to the model input when running on-device. A simple way to do this is to use a platform- and language-independent format, such as a JSON file containing a map from every feature to metadata about how its preprocessing is done. You can find more detail on how this is done in the example app.
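For illustration only (the example app's actual schema may differ), the training side could write out a metadata file shaped like this:

import json

# For each feature, record how it was transformed during training so the app
# can mirror the exact same preprocessing at inference time.
feature_metadata = {
    "coins_spent": {"type": "numeric", "mean": 1500.0, "std": 450.0},
    "geo_country": {"type": "categorical", "vocab": ["Canada", "USA", "Other"]},
}

with open("feature_metadata.json", "w") as f:
    json.dump(feature_metadata, f, indent=2)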
Next, we give the model a test input as follows:
IapOptimizer.kt
val testInput = mapOf(
  "coins_spent" to 2048f,
  "distance_avg" to 1234f,
  "device_os" to "ANDROID",
  "game_day" to 10f,
  "geo_country" to "Canada",
  "last_run_end_reason" to "laser"
)
The model suggests sparky_armor is the best IAP powerup for this particular user.
Measure model accuracy
To measure our model's accuracy, we can keep track of the IAP offers predicted by our model and whether they are clicked on, using Firebase Analytics. You can use this together with Firebase A/B Testing to measure the actual performance of the model. Taking it one step further, you can also perform A/B tests on different iterations of the model. You can learn more about A/B testing with Firebase in the Create Firebase Remote Config Experiments with A/B Testing documentation.
9. (Optional) Updating the model regularly with new data
If you need to update your model as new data comes in, you can set up a pipeline to retrain it on a recurring basis. For this to work, you first need to make sure you have new training data collected using the epsilon-greedy strategy mentioned above (e.g. using the model's prediction 70% of the time and random results 30% of the time).
Configuring a pipeline for training and deploying with new data is beyond the scope of this codelab; to get started, check out Google Cloud AI Platform and TFX.
10. Congratulations!
In this codelab, you learned how to train and deploy an on-device TFLite model for optimizing in-app purchases using Firebase. To learn more about TFLite and Firebase, take a look at other TFLite samples and the Firebase getting started guides.
If you have any questions, you can leave them at Stack Overflow #firebase-machine-learning.
What we've covered
- TensorFlow Lite
- Firebase ML
- Firebase Analytics
- BigQuery
Next Steps
- Train and deploy an optimizer model for your app.