League of Legends Gameplay Data Extraction using Computer Vision (Master's Thesis)
Project Description:
This project focuses on extracting gameplay data from match videos of League of Legends, a widely played competitive multiplayer game, so that play styles can be analyzed through data analytics. It is a research project supervised by Dr. Corey Clark, Director of the Human & Machine Intelligence Game Lab at Southern Methodist University.
Development Specification:
- Language: Python
- Team Size: 2 Programmers and 1 Supervisor
- Development Time: 6 Months
- Domain: Computer Vision using Convolutional Neural Networks (CNNs)
Responsibilities
- Evaluated the limitations of conventional computer vision techniques such as template matching and feature matching.
- Built a sample generator that generates labeled images for training the CNN.
- Trained the CNNs on the data generated and tested the models.
- Built a system to select which champion CNN models to run inference with on the video frames.
- Worked with the team to define a data format (JSON).
- Built a data post-processing system to help smooth the positional data over time.
- Built a bounding box visualizer and death location visualizer to validate the position data collected.
- Wrote a research paper describing the methodology and results.
Why do this?
Game analytics plays an important role in guiding teams in both traditional sports and electronic sports (eSports) by providing a better understanding of opponents' strategies. Unfortunately, not every eSports title has a reliable method to extract, collect, and analyze the data needed for team and player analysis. For instance, League of Legends (LoL), a popular multiplayer online battle arena video game, does not provide a way to track gameplay data such as players' positions, hit points, and mana without watching and manually encoding the entire match. This project presents a methodology to extract, collect, and analyze gameplay data from LoL videos via automated computer vision and offline post-processing techniques. We use template matching and feature matching to extract information from static UI elements. To track players' positions on the dynamic minimap, we use object detection via convolutional neural networks.
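For the static UI elements, the matching step can be done with standard OpenCV primitives. Below is a minimal sketch of template matching against a single frame; the file paths, template, and score threshold are illustrative stand-ins rather than the project's actual values.

import cv2

# Illustrative inputs: a captured video frame and a UI element template (e.g. a champion portrait).
frame = cv2.imread('frames/frame_0001.png', cv2.IMREAD_GRAYSCALE)
template = cv2.imread('templates/champion_portrait.png', cv2.IMREAD_GRAYSCALE)

# Slide the template across the frame and score every location.
scores = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(scores)

# Accept the match only if the best score clears an (illustrative) threshold.
if max_val > 0.8:
    h, w = template.shape
    top_left = max_loc
    bot_right = (top_left[0] + w, top_left[1] + h)
    print('Match at {} with score {:.2f}'.format(top_left, max_val))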
Training Sample Generator
A neural network requires labeled training data to adjust its weights. In this case, the training data consists of minimap images with champion icons overlaid on them. I built a Python script that generates these overlaid, labeled minimap images automatically to ease the creation of training data.
# Loop through each icon image in the source icon image list
for idx, image_file in enumerate(os.scandir(src_image_dir)):
    offset = [0, 0]
    counter = 0
    img_name = image_file.name

    # Read the overlay (champion icon) image and resize it to the minimap icon size
    overlay = cv2.imread(image_file.path, cv2.IMREAD_UNCHANGED)
    overlay = cv2.resize(overlay, overlay_size)
    object_name = img_name.replace('.png', '')
    print('Generating samples for: {}'.format(object_name))

    # Format the path variables from the object_name
    annotations_path = input_annotations_path
    dst_image_dir = os.path.join(input_dst_image_dir)
    dst_image_dir += '\\'
    annotations_path += '\\'
    print('Writing samples to {}'.format(dst_image_dir))
    print('Writing annotations to {}'.format(annotations_path))

    if not os.path.isdir(input_dst_image_dir):
        os.mkdir(input_dst_image_dir)
        print('Creating output directory for {}'.format(input_dst_image_dir))
    if not os.path.isdir(input_annotations_path):
        os.mkdir(input_annotations_path)
        print('Creating output directory for {}'.format(input_annotations_path))
    if not os.path.isdir(dst_image_dir):
        os.mkdir(dst_image_dir)
    if not os.path.isdir(annotations_path):
        os.mkdir(annotations_path)

    while counter < total_count:
        # Choose a random background image from the background list
        bg_choice = random.randint(0, len(bg_images) - 1)
        bg = bg_images[bg_choice]
        result = bg

        # Sprinkle other champion icons onto the background as distractors
        num_heroes_to_sprinkle = random.randint(0, 9)
        for _ in range(num_heroes_to_sprinkle):
            hero_icon_idx = random.randint(0, len(images) - 1)
            hero_icon = images[hero_icon_idx]
            offset[0] = random.randint(0, bg.shape[0] - hero_icon.shape[0])
            offset[1] = random.randint(0, bg.shape[1] - hero_icon.shape[1])
            result = GenerateOverlayImage(result, hero_icon, offset)

        # Generate a random offset depending on the sizes of the overlay and background images
        offset[0] = random.randint(0, bg.shape[0] - overlay.shape[0])
        offset[1] = random.randint(0, bg.shape[1] - overlay.shape[1])

        # Generate the overlaid image
        result = GenerateOverlayImage(result, overlay, offset)
        out_img_name = '{}_{}.jpg'.format(object_name, counter)
        out_path = os.path.join(dst_image_dir, out_img_name)

        # Write the XML annotation with the labeled bounding box
        topLeft = [offset[1], offset[0]]
        botRight = [offset[1] + overlay.shape[1], offset[0] + overlay.shape[0]]
        folder_tag = 'images'
        WriteXMLv2(folder_tag, result, out_img_name, out_path, [object_name],
                   topLeft, botRight, annotations_path)
        print('Offset:{} --- DestImage:{}'.format(offset, out_path))

        # Add salt-and-pepper noise if enabled
        if random.random() < enable_noise:
            noise_gen.noisy('s&p', result)

        # Write the image to file
        cv2.imwrite(out_path, result, [int(cv2.IMWRITE_JPEG_QUALITY), 100])
        counter += 1
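The GenerateOverlayImage helper referenced above is not shown here. A minimal sketch of how such an overlay could be implemented with alpha compositing, assuming the icon images carry an alpha channel (this is a reconstruction, not the exact implementation):

import numpy as np

def GenerateOverlayImage(background, overlay, offset):
    """Alpha-blend a BGRA overlay icon onto a BGR background at a (row, col) offset."""
    result = background.copy()
    rows, cols = overlay.shape[0], overlay.shape[1]
    r0, c0 = offset[0], offset[1]

    # Normalized alpha channel of the icon (0.0 = transparent, 1.0 = opaque).
    alpha = (overlay[:, :, 3].astype(np.float32) / 255.0)[:, :, np.newaxis]

    # Blend the icon with the region of interest it covers.
    roi = result[r0:r0 + rows, c0:c0 + cols].astype(np.float32)
    blended = alpha * overlay[:, :, :3].astype(np.float32) + (1.0 - alpha) * roi
    result[r0:r0 + rows, c0:c0 + cols] = blended.astype(np.uint8)
    return result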
Inference and Data Output
Usually, a single CNN model would be trained to detect all champions. Here, that approach led to a lot of false positives, since the champion icons share similar features and are small on the minimap. To improve prediction accuracy, I instead trained one CNN model per champion. Although this increases inference time roughly tenfold, it reduced false positives to at most 5% of the frames. Before running the CNNs on the video frames, we use template matching and feature matching to identify which frames to run inference on and which champions are part of the match. This step generates a JSON file containing frame numbers and champion states gathered from the other static UI elements. Using that JSON file, I run each champion's CNN model on the frames it lists.
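Based on the fields read by the scripts in this and the following sections, one entry of that JSON file looks roughly like the structure below (shown as the Python dictionary it becomes after json.load). The schema is reconstructed from the code, so the real file may contain additional fields; the champion name is just an example.

match_data = {
    "MatchFrames": [
        {
            "Frame": 1520,                 # video frame number to seek to
            "Champions": {
                "Ahri": {                  # one entry per champion in the match
                    "is_alive": True,      # state gathered from the static UI elements
                    "Position": [-1, -1],  # filled in by the inference pass
                    "Bbox": {
                        "TopLeft": [0, 0],
                        "BotRight": [0, 0],
                    },
                },
                # ... entries for the other nine champions ...
            },
        },
        # ... one object per sampled frame ...
    ],
}

The inference script below reads this structure, fills in each champion's Position and Bbox, and writes the result back out.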
# Initializes a classifier network for the given champion label
def init_classifier(label):
    label_str = label.lower()
    options = {
        "model": 'cfg/tiny-yolo-voc-1c-' + label_str + '.cfg',
        "load": 3100,
        "gpu": 0.8,
        "labels": 'data/individual_detectors/' + label + '/labels.txt',
        "threshold": 0.2,
    }
    classifier_network = TFNet(options)
    return classifier_network


if __name__ == "__main__":
    for label in labels:
        frame_counter = 1
        # Initialize the per-champion network
        classifier = init_classifier(label)

        # Iterate through each frame described in the input JSON
        for frame_obj in match_frames[1:]:
            # Seek to the frame number
            frame_num = int(frame_obj["Frame"])
            capture.set(cv2.CAP_PROP_POS_FRAMES, frame_num)

            # Capture the frame
            is_valid, frame = capture.read()
            if not is_valid:
                break

            # Check if the champion is alive and skip this frame if not
            is_alive = frame_obj["Champions"][label]["is_alive"]
            if not is_alive:
                bbox = box([0, 0], [0, 0], False)
                champions_obj = frame_obj["Champions"]
                champions_obj[label]["Position"] = bbox.get_center()
                champions_obj[label]["Bbox"] = {"TopLeft": bbox.topleft, "BotRight": bbox.botright}
                frame_counter += 1
                progress_percent = (frame_counter / totalFrames) * 100
                sys.stdout.write("\rLabel:{} Progress: {:.0f}%".format(label, progress_percent))
                sys.stdout.flush()
                continue

            # Classify the frame with this champion's network
            prediction_result = classify_frame_with_network(frame, classifier)

            # Get bounding box details for the detected champion
            bbox = get_bbox_for_obj_from_prediction(label, prediction_result)
            center = bbox.get_center()
            champions_obj = frame_obj["Champions"]
            champions_obj[label]["Position"] = center
            champions_obj[label]["Bbox"] = {"TopLeft": bbox.topleft, "BotRight": bbox.botright}
            frame_counter += 1
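The classify_frame_with_network and get_bbox_for_obj_from_prediction helpers, as well as the box class, are not shown above. A rough sketch of what they could look like on top of darkflow's TFNet.return_predict API, under the assumption that a [-1, -1] position is the sentinel the post-processing pass uses for missed detections:

class box:
    """Minimal bounding-box holder used by the inference and post-processing scripts (assumed)."""
    def __init__(self, topleft, botright, valid=True):
        self.topleft = topleft
        self.botright = botright
        self.valid = valid

    def get_center(self):
        return [(self.topleft[0] + self.botright[0]) // 2,
                (self.topleft[1] + self.botright[1]) // 2]


def classify_frame_with_network(frame, classifier_network):
    # darkflow's TFNet.return_predict returns a list of dicts with
    # 'label', 'confidence', 'topleft' and 'bottomright' keys.
    return classifier_network.return_predict(frame)


def get_bbox_for_obj_from_prediction(label, prediction_result):
    # Pick the highest-confidence detection matching the champion label.
    best = None
    for detection in prediction_result:
        if detection['label'].lower() != label.lower():
            continue
        if best is None or detection['confidence'] > best['confidence']:
            best = detection
    if best is None:
        # A [-1, -1] center marks a detection loss for the post-processing pass (assumption).
        return box([-1, -1], [-1, -1], False)
    return box([best['topleft']['x'], best['topleft']['y']],
               [best['bottomright']['x'], best['bottomright']['y']])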
Post-Processing Positional Data
While using individual object detectors improved accuracy over the multi-object detector model, there were still detection losses, i.e., frames in which champions were not identified even though they were alive. These losses were mostly caused by champion icons overlapping other icons on the minimap (e.g., other champion icons, ping icons, the minimap cursor). Frames with these losses were tagged so that they could later be post-processed to obtain a position with marginal error, where marginal error means a position in the approximate vicinity of the actual position.
if __name__ == '__main__':
    # Load the input JSON
    json_obj = None
    with open(in_json_path, 'r') as in_json_file:
        json_str = in_json_file.read()
        json_obj = json.loads(json_str)
    match_frames = json_obj["MatchFrames"]

    for label in labels:
        last_good_frame_idx = -1
        last_bad_frame_idx = -1
        last_good_location = []
        last_good_box = None

        # Iterate through match frames and assign the position from the last good frame
        # to the frames where detection was lost
        for frame_idx, frame_obj in enumerate(match_frames):
            location = frame_obj["Champions"][label]["Position"]
            bbox = box(frame_obj["Champions"][label]["Bbox"]["TopLeft"],
                       frame_obj["Champions"][label]["Bbox"]["BotRight"])

            if location[0] == -1:
                # Detection loss: fill in this frame from the last good frame
                last_bad_frame_idx = frame_idx
                populate_frame_info_from_last_good(last_good_location, last_good_box,
                                                   last_good_frame_idx, frame_idx, label, match_frames)
                print("Reading Frame Index:", frame_idx)
            else:
                # Special case for the first good frame
                if last_good_frame_idx == -1:
                    last_good_location = location.copy()
                    last_good_frame_idx = frame_idx
                    last_good_box = bbox
                else:
                    is_loc_good_enough = False
                    for idx in range(0, num_frames_to_check):
                        # If idx goes out of bounds, accept the location currently in memory
                        if (idx + frame_idx) >= len(match_frames):
                            last_good_location = location.copy()
                            last_good_frame_idx = frame_idx
                            last_good_box = bbox
                            break
                        else:
                            # The location only counts as "good" if it stays nearly the same
                            # over the next few frames
                            cur_frame_obj = match_frames[idx + frame_idx]
                            cur_location = cur_frame_obj["Champions"][label]["Position"]
                            if check_location_nearly_equal(location, cur_location):
                                is_loc_good_enough = True
                            else:
                                is_loc_good_enough = False
                                break
                    if is_loc_good_enough:
                        last_good_location = location.copy()
                        last_good_frame_idx = frame_idx
                        last_good_box = bbox
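The check_location_nearly_equal and populate_frame_info_from_last_good helpers are also not shown. Plausible sketches follow; the pixel tolerance and the hold-last-good strategy are assumptions rather than the project's exact choices.

def check_location_nearly_equal(loc_a, loc_b, tolerance=5):
    # Treat two minimap positions as equal if they are within a few pixels of each other.
    return (abs(loc_a[0] - loc_b[0]) <= tolerance and
            abs(loc_a[1] - loc_b[1]) <= tolerance)


def populate_frame_info_from_last_good(last_good_location, last_good_box,
                                       last_good_frame_idx, frame_idx, label, match_frames):
    # No reliable detection in this frame: carry the last known-good position forward.
    if last_good_frame_idx == -1 or last_good_box is None:
        return
    champion_obj = match_frames[frame_idx]["Champions"][label]
    champion_obj["Position"] = list(last_good_location)
    champion_obj["Bbox"] = {
        "TopLeft": last_good_box.topleft,
        "BotRight": last_good_box.botright,
    }

Holding the last good position keeps the smoothing simple; interpolating between the surrounding good frames would be a natural refinement.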
Validation of Results
To validate the results obtained by this method, I built a death location visualizer and a bounding box visualizer.
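As a rough illustration of what the bounding-box visualizer does, the sketch below draws each champion's stored box back onto the corresponding video frames; the file paths are placeholders and the JSON layout follows the structure sketched earlier.

import json
import cv2

# Placeholder paths; the real script takes the match JSON and video as inputs.
with open('match_data.json', 'r') as f:
    match_frames = json.load(f)["MatchFrames"]

capture = cv2.VideoCapture('match_video.mp4')
for frame_obj in match_frames:
    capture.set(cv2.CAP_PROP_POS_FRAMES, int(frame_obj["Frame"]))
    is_valid, frame = capture.read()
    if not is_valid:
        break
    # Draw every champion's bounding box and label onto the frame.
    for label, champ in frame_obj["Champions"].items():
        top_left = tuple(champ["Bbox"]["TopLeft"])
        bot_right = tuple(champ["Bbox"]["BotRight"])
        cv2.rectangle(frame, top_left, bot_right, (0, 255, 0), 1)
        cv2.putText(frame, label, top_left, cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0, 255, 0), 1)
    cv2.imshow('Bounding boxes', frame)
    if cv2.waitKey(0) & 0xFF == ord('q'):
        break
capture.release()
cv2.destroyAllWindows()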
Postmortem
- The YOLO CNN model was relatively fast to train and to run inference with compared to other pre-trained models.
- Got to work on a research project and was able to write a full-fledged paper.
- Got to work with neural networks and learn how they work.
- Identified a bit late that there were bad images in the training data; some of the images I used initially caused detection errors.
- Although I worked with neural networks using TensorFlow for the entirety of the project, I didn't get to work with TensorFlow at a low level, since it is abstracted away by the pre-trained model.
- Training took a lot of time. Keeping up with estimates was rather hard.
- Training and inference involve a lot of educated trial and error; neural networks are still largely a black box.
- Using a machine learning model while also understanding its inner workings takes much longer than expected, and short time constraints are difficult because the implementation itself takes a lot of time.
- Training is very time consuming, so budget more time than initially estimated in case a trained model does not perform well.