League of Legends Gameplay Data Extraction using Computer Vision (Master's Thesis)
Project Description:
This project focuses on extracting gameplay data from match videos of League of Legends, a widely played competitive multiplayer game, so that play styles can be analyzed through data analytics. It is a research project supervised by Dr. Corey Clark, Director of the Human & Machine Intelligence Game Lab at Southern Methodist University.
Development Specification:
- Language: Python
- Team Size: 2 Programmers and 1 Supervisor
- Development Time: 6 Months
- Domain: Computer Vision using Convolutional Neural Networks (CNNs)
Responsibilities
- Evaluated the limitations of conventional computer vision techniques such as template matching and feature matching.
- Built a sample generator that generates labeled images for training the CNN.
- Trained the CNNs on the data generated and tested the models.
- Built a system to select which champion CNN models to run inference with on the video frames.
- Worked with the team to define a data format (JSON).
- Built a data post-processing system to help smooth the positional data over time.
- Built a bounding box visualizer and death location visualizer to validate the position data collected.
- Wrote a research paper describing the methodology and results.
Why do this?
Game analytics plays an important role in guiding teams in both traditional sports and electronic sports (eSports) by providing a better understanding of opponents' strategies. Unfortunately, not every eSports title has a reliable method to extract, collect, and analyze the data needed for team and player analysis. For instance, League of Legends (LoL), a popular multiplayer online battle arena video game, does not provide a way to track gameplay data such as players' positions, hit points, and mana without watching and manually encoding the entire match. This project presents a methodology to extract, collect, and analyze gameplay data from LoL videos via automated computer vision and offline post-processing techniques. We use template matching and feature matching to extract information from static UI elements. To track players' positions on the dynamic minimap, we use object detection via convolutional neural networks.
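For the static UI elements, the matching step can be done with standard OpenCV primitives. Below is a minimal sketch of template matching against a single frame; the file paths, template, and score threshold are illustrative stand-ins rather than the project's actual values.

import cv2

# Illustrative inputs: a captured video frame and a UI element template (e.g. a champion portrait).
frame = cv2.imread('frames/frame_0001.png', cv2.IMREAD_GRAYSCALE)
template = cv2.imread('templates/champion_portrait.png', cv2.IMREAD_GRAYSCALE)

# Slide the template across the frame and score every location.
scores = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(scores)

# Accept the match only if the best score clears an (illustrative) threshold.
if max_val > 0.8:
    h, w = template.shape
    top_left = max_loc
    bot_right = (top_left[0] + w, top_left[1] + h)
    print('Match at {} with score {:.2f}'.format(top_left, max_val))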
Training Sample Generator
A neural network requires labeled training data to adjust its weights. In this case, the training data consists of minimap images with champion icons overlaid on them. I built a Python script that generates these overlaid, labeled minimap images automatically to ease the creation of training data.
# Loop through each icon image in the source icon image list
for idx, image_file in enumerate(os.scandir(src_image_dir)):
    offset = [0, 0]
    counter = 0
    img_name = image_file.name

    # Read the overlay (champion icon) image and resize it to the minimap icon size
    overlay = cv2.imread(image_file.path, cv2.IMREAD_UNCHANGED)
    overlay = cv2.resize(overlay, overlay_size)
    object_name = img_name.replace('.png', '')
    print('Generating samples for: {}'.format(object_name))

    # Format the path variables from the object_name
    annotations_path = input_annotations_path
    dst_image_dir = os.path.join(input_dst_image_dir)
    dst_image_dir += '\\'
    annotations_path += '\\'
    print('Writing samples to {}'.format(dst_image_dir))
    print('Writing annotations to {}'.format(annotations_path))

    if not os.path.isdir(input_dst_image_dir):
        os.mkdir(input_dst_image_dir)
        print('Creating output directory for {}'.format(input_dst_image_dir))
    if not os.path.isdir(input_annotations_path):
        os.mkdir(input_annotations_path)
        print('Creating output directory for {}'.format(input_annotations_path))
    if not os.path.isdir(dst_image_dir):
        os.mkdir(dst_image_dir)
    if not os.path.isdir(annotations_path):
        os.mkdir(annotations_path)

    while counter < total_count:
        # Choose a random background image from the background list
        bg_choice = random.randint(0, len(bg_images) - 1)
        bg = bg_images[bg_choice]
        result = bg

        # Sprinkle other champion icons onto the background as distractors
        num_heroes_to_sprinkle = random.randint(0, 9)
        for _ in range(num_heroes_to_sprinkle):
            hero_icon_idx = random.randint(0, len(images) - 1)
            hero_icon = images[hero_icon_idx]
            offset[0] = random.randint(0, bg.shape[0] - hero_icon.shape[0])
            offset[1] = random.randint(0, bg.shape[1] - hero_icon.shape[1])
            result = GenerateOverlayImage(result, hero_icon, offset)

        # Generate a random offset depending on the sizes of the overlay and background images
        offset[0] = random.randint(0, bg.shape[0] - overlay.shape[0])
        offset[1] = random.randint(0, bg.shape[1] - overlay.shape[1])

        # Generate the overlaid image
        result = GenerateOverlayImage(result, overlay, offset)
        out_img_name = '{}_{}.jpg'.format(object_name, counter)
        out_path = os.path.join(dst_image_dir, out_img_name)

        # Write the XML annotation with the labeled bounding box
        topLeft = [offset[1], offset[0]]
        botRight = [offset[1] + overlay.shape[1], offset[0] + overlay.shape[0]]
        folder_tag = 'images'
        WriteXMLv2(folder_tag, result, out_img_name, out_path, [object_name],
                   topLeft, botRight, annotations_path)
        print('Offset:{} --- DestImage:{}'.format(offset, out_path))

        # Add salt-and-pepper noise if enabled
        if random.random() < enable_noise:
            noise_gen.noisy('s&p', result)

        # Write the image to file
        cv2.imwrite(out_path, result, [int(cv2.IMWRITE_JPEG_QUALITY), 100])
        counter += 1
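The GenerateOverlayImage helper referenced above is not shown here. A minimal sketch of how such an overlay could be implemented with alpha compositing, assuming the icon images carry an alpha channel (this is a reconstruction, not the exact implementation):

import numpy as np

def GenerateOverlayImage(background, overlay, offset):
    """Alpha-blend a BGRA overlay icon onto a BGR background at a (row, col) offset."""
    result = background.copy()
    rows, cols = overlay.shape[0], overlay.shape[1]
    r0, c0 = offset[0], offset[1]

    # Normalized alpha channel of the icon (0.0 = transparent, 1.0 = opaque).
    alpha = (overlay[:, :, 3].astype(np.float32) / 255.0)[:, :, np.newaxis]

    # Blend the icon with the region of interest it covers.
    roi = result[r0:r0 + rows, c0:c0 + cols].astype(np.float32)
    blended = alpha * overlay[:, :, :3].astype(np.float32) + (1.0 - alpha) * roi
    result[r0:r0 + rows, c0:c0 + cols] = blended.astype(np.uint8)
    return result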
Inference and Data Output
Usually, a single CNN model would be trained to detect all champions. Here, that approach led to a lot of false positives, since the champion icons share similar features and are small on the minimap. To improve prediction accuracy, I instead trained one CNN model per champion. Although this increases inference time roughly tenfold, it reduced false positives to at most 5% of the frames. Before running the CNNs on the video frames, we use template matching and feature matching to identify which frames to run inference on and which champions are part of the match. This step generates a JSON file containing frame numbers and champion states gathered from the other static UI elements. Using that JSON file, I run each champion's CNN model on the frames it lists.
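Based on the fields read by the scripts in this and the following sections, one entry of that JSON file looks roughly like the structure below (shown as the Python dictionary it becomes after json.load). The schema is reconstructed from the code, so the real file may contain additional fields; the champion name is just an example.

match_data = {
    "MatchFrames": [
        {
            "Frame": 1520,                 # video frame number to seek to
            "Champions": {
                "Ahri": {                  # one entry per champion in the match
                    "is_alive": True,      # state gathered from the static UI elements
                    "Position": [-1, -1],  # filled in by the inference pass
                    "Bbox": {
                        "TopLeft": [0, 0],
                        "BotRight": [0, 0],
                    },
                },
                # ... entries for the other nine champions ...
            },
        },
        # ... one object per sampled frame ...
    ],
}

The inference script below reads this structure, fills in each champion's Position and Bbox, and writes the result back out.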
# Initializes a classifier network for the given champion label
def init_classifier(label):
    label_str = label.lower()
    options = {
        "model": 'cfg/tiny-yolo-voc-1c-' + label_str + '.cfg',
        "load": 3100,
        "gpu": 0.8,
        "labels": 'data/individual_detectors/' + label + '/labels.txt',
        "threshold": 0.2,
    }
    classifier_network = TFNet(options)
    return classifier_network


if __name__ == "__main__":
    for label in labels:
        frame_counter = 1
        # Initialize the per-champion network
        classifier = init_classifier(label)

        # Iterate through each frame described in the input JSON
        for frame_obj in match_frames[1:]:
            # Seek to the frame number
            frame_num = int(frame_obj["Frame"])
            capture.set(cv2.CAP_PROP_POS_FRAMES, frame_num)

            # Capture the frame
            is_valid, frame = capture.read()
            if not is_valid:
                break

            # Check if the champion is alive and skip this frame if not
            is_alive = frame_obj["Champions"][label]["is_alive"]
            if not is_alive:
                bbox = box([0, 0], [0, 0], False)
                champions_obj = frame_obj["Champions"]
                champions_obj[label]["Position"] = bbox.get_center()
                champions_obj[label]["Bbox"] = {"TopLeft": bbox.topleft, "BotRight": bbox.botright}
                frame_counter += 1
                progress_percent = (frame_counter / totalFrames) * 100
                sys.stdout.write("\rLabel:{} Progress: {:.0f}%".format(label, progress_percent))
                sys.stdout.flush()
                continue

            # Classify the frame with this champion's network
            prediction_result = classify_frame_with_network(frame, classifier)

            # Get bounding box details for the detected champion
            bbox = get_bbox_for_obj_from_prediction(label, prediction_result)
            center = bbox.get_center()
            champions_obj = frame_obj["Champions"]
            champions_obj[label]["Position"] = center
            champions_obj[label]["Bbox"] = {"TopLeft": bbox.topleft, "BotRight": bbox.botright}
            frame_counter += 1
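The classify_frame_with_network and get_bbox_for_obj_from_prediction helpers, as well as the box class, are not shown above. A rough sketch of what they could look like on top of darkflow's TFNet.return_predict API, under the assumption that a [-1, -1] position is the sentinel the post-processing pass uses for missed detections:

class box:
    """Minimal bounding-box holder used by the inference and post-processing scripts (assumed)."""
    def __init__(self, topleft, botright, valid=True):
        self.topleft = topleft
        self.botright = botright
        self.valid = valid

    def get_center(self):
        return [(self.topleft[0] + self.botright[0]) // 2,
                (self.topleft[1] + self.botright[1]) // 2]


def classify_frame_with_network(frame, classifier_network):
    # darkflow's TFNet.return_predict returns a list of dicts with
    # 'label', 'confidence', 'topleft' and 'bottomright' keys.
    return classifier_network.return_predict(frame)


def get_bbox_for_obj_from_prediction(label, prediction_result):
    # Pick the highest-confidence detection matching the champion label.
    best = None
    for detection in prediction_result:
        if detection['label'].lower() != label.lower():
            continue
        if best is None or detection['confidence'] > best['confidence']:
            best = detection
    if best is None:
        # A [-1, -1] center marks a detection loss for the post-processing pass (assumption).
        return box([-1, -1], [-1, -1], False)
    return box([best['topleft']['x'], best['topleft']['y']],
               [best['bottomright']['x'], best['bottomright']['y']])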
Post-Processing Positional Data
While using individual object detectors improved accuracy over the multi-object detector model, there were still detection losses, i.e., frames in which champions were not identified even though they were alive. These losses were mostly caused by champion icons overlapping other icons on the minimap (e.g., other champion icons, ping icons, the minimap cursor). Frames with these losses were tagged so that they could later be post-processed to obtain a position with marginal error, where marginal error means a position in the approximate vicinity of the actual position.
if __name__ == '__main__':
    # Load the input JSON
    json_obj = None
    with open(in_json_path, 'r') as in_json_file:
        json_str = in_json_file.read()
        json_obj = json.loads(json_str)
    match_frames = json_obj["MatchFrames"]

    for label in labels:
        last_good_frame_idx = -1
        last_bad_frame_idx = -1
        last_good_location = []
        last_good_box = None

        # Iterate through match frames and assign the position from the last good frame
        # to the frames where detection was lost
        for frame_idx, frame_obj in enumerate(match_frames):
            location = frame_obj["Champions"][label]["Position"]
            bbox = box(frame_obj["Champions"][label]["Bbox"]["TopLeft"],
                       frame_obj["Champions"][label]["Bbox"]["BotRight"])

            if location[0] == -1:
                # Detection loss: fill in this frame from the last good frame
                last_bad_frame_idx = frame_idx
                populate_frame_info_from_last_good(last_good_location, last_good_box,
                                                   last_good_frame_idx, frame_idx, label, match_frames)
                print("Reading Frame Index:", frame_idx)
            else:
                # Special case for the first good frame
                if last_good_frame_idx == -1:
                    last_good_location = location.copy()
                    last_good_frame_idx = frame_idx
                    last_good_box = bbox
                else:
                    is_loc_good_enough = False
                    for idx in range(0, num_frames_to_check):
                        # If idx goes out of bounds, accept the location currently in memory
                        if (idx + frame_idx) >= len(match_frames):
                            last_good_location = location.copy()
                            last_good_frame_idx = frame_idx
                            last_good_box = bbox
                            break
                        else:
                            # The location only counts as "good" if it stays nearly the same
                            # over the next few frames
                            cur_frame_obj = match_frames[idx + frame_idx]
                            cur_location = cur_frame_obj["Champions"][label]["Position"]
                            if check_location_nearly_equal(location, cur_location):
                                is_loc_good_enough = True
                            else:
                                is_loc_good_enough = False
                                break
                    if is_loc_good_enough:
                        last_good_location = location.copy()
                        last_good_frame_idx = frame_idx
                        last_good_box = bbox
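The check_location_nearly_equal and populate_frame_info_from_last_good helpers are also not shown. Plausible sketches follow; the pixel tolerance and the hold-last-good strategy are assumptions rather than the project's exact choices.

def check_location_nearly_equal(loc_a, loc_b, tolerance=5):
    # Treat two minimap positions as equal if they are within a few pixels of each other.
    return (abs(loc_a[0] - loc_b[0]) <= tolerance and
            abs(loc_a[1] - loc_b[1]) <= tolerance)


def populate_frame_info_from_last_good(last_good_location, last_good_box,
                                       last_good_frame_idx, frame_idx, label, match_frames):
    # No reliable detection in this frame: carry the last known-good position forward.
    if last_good_frame_idx == -1 or last_good_box is None:
        return
    champion_obj = match_frames[frame_idx]["Champions"][label]
    champion_obj["Position"] = list(last_good_location)
    champion_obj["Bbox"] = {
        "TopLeft": last_good_box.topleft,
        "BotRight": last_good_box.botright,
    }

Holding the last good position keeps the smoothing simple; interpolating between the surrounding good frames would be a natural refinement.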
Validation of Results
To validate the results obtained by this method, I built a death location visualizer and a bounding box visualizer.
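As a rough illustration of what the bounding-box visualizer does, the sketch below draws each champion's stored box back onto the corresponding video frames; the file paths are placeholders and the JSON layout follows the structure sketched earlier.

import json
import cv2

# Placeholder paths; the real script takes the match JSON and video as inputs.
with open('match_data.json', 'r') as f:
    match_frames = json.load(f)["MatchFrames"]

capture = cv2.VideoCapture('match_video.mp4')
for frame_obj in match_frames:
    capture.set(cv2.CAP_PROP_POS_FRAMES, int(frame_obj["Frame"]))
    is_valid, frame = capture.read()
    if not is_valid:
        break
    # Draw every champion's bounding box and label onto the frame.
    for label, champ in frame_obj["Champions"].items():
        top_left = tuple(champ["Bbox"]["TopLeft"])
        bot_right = tuple(champ["Bbox"]["BotRight"])
        cv2.rectangle(frame, top_left, bot_right, (0, 255, 0), 1)
        cv2.putText(frame, label, top_left, cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0, 255, 0), 1)
    cv2.imshow('Bounding boxes', frame)
    if cv2.waitKey(0) & 0xFF == ord('q'):
        break
capture.release()
cv2.destroyAllWindows()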
Postmortem
- The YOLO CNN model was relatively fast to train and to run inference with compared to other pre-trained models.
- Got to work on a research project and was able to write a full-fledged paper.
- Got to work with neural networks and learn how they work.
- Identified a bit late that there were bad images in the training data; some of the images I used initially caused detection errors.
- Although I worked with neural networks using TensorFlow for the entirety of the project, I didn't get to work with TensorFlow at a low level, since it is abstracted away by the pre-trained model.
- Training took a lot of time. Keeping up with estimates was rather hard.
- Training and inference involve a lot of educated trial and error; neural networks are still largely a black box.
- Using a machine learning model while also understanding its inner workings takes much longer than expected, and short time constraints are difficult because the implementation itself takes a lot of time.
- Training is very time consuming, so budget more time than initially estimated in case a trained model does not perform well.