What I learnt from trying to reverse engineer a DRM implementation

DRM stands for Digital Rights Management. It lets software hide information in plain sight from the end user: it's like a cake you can see but can't have. The implementation is what makes DRM a nightmare to crack. Big companies like Google and Microsoft have built their own implementations for video, and they are hard to crack. Why? They have hundreds of engineers and the financial backing to match, and you know the rest.
But it isn't impossible. In October 2020 a GitHub repository open sourced a Chrome extension that contained the private key of the CDM for Google's Widevine L3 library. CDM stands for Content Decryption Module: a heavily obfuscated compiled binary that decrypts the license data for a video and uses it to decrypt the video file. It made me wonder: maybe breaking DRM isn't that difficult if the implementation is weak.
I had a subscription to a site that uses a proprietary DRM for desktop browsing. Why specifically desktop browsing? DRM providers hate competition. For example, Apple's FairPlay DRM can only be used on Apple devices, and you have to be the content owner to use the service. Likewise, Google's Widevine DRM uses DASH-based streaming and encoding, which Apple devices don't support natively. There are also a lot of restrictions on browser support, and many DRM providers don't support third-party browsers at all. DRM providers usually charge for their technology and key exchange mechanisms, and ask you to sign a license agreement covering use of the service and compliance. This forces a lot of video streaming companies, especially those serving a range of platforms such as web, Android and iOS, to implement more than one DRM solution for their content, which cripples them to some extent. The site I subscribed to decided to use a proprietary DRM solution for the web and Marlin DRM for mobile.
Since the day I saw the GitHub repo, I had a hunch that if I dedicated enough time I could learn a lot from it and possibly crack the DRM and recover the decrypted data. So I immediately set aside some time to work on it. The first thing I did was inspect how the data is fetched and enumerate the DRM solutions in place. I opened the site with the developer console in Chrome, and the first thing I saw was a lot of JavaScript files with long names popping up in the network tab. [picture] I figured they must be using React for the web.
Next I played a video and saw a license request being made. The response was a JSON file with a base64-encoded license string and an expiry timestamp. After that the manifest URL fired in the network tab. I waited until the processing finished and a URL to the video was built. As I expected, the content was encrypted. [picture]
So I took a step back and looked at the manifest. To my surprise, it didn't mention any key identifier (KID). The KID is what DRM providers use to find the key: think of it like a hashtable where the KID is the key and the decryption key is the value. This raised my suspicion. If the license request was sent before the manifest, and the manifest has no KID, how are they getting the key? And what does the license response contain? I assumed that the license was itself some kind of key to decrypt the video, and that there was no actual DRM key exchange mechanism!
Guesswork can sometimes be useful, yet most of the time it leads you down a rabbit hole. This time it proved useful. The license response was actually an encrypted key along with some other data about the content, like the course name and course ID. After a long time spent un-minifying the JS and finding where the decryption happens, I was able to recover the key. It was a 10-letter key. Now this raised even more questions: was I really on the right path?
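To make the shape of that license response concrete, here is a minimal Python sketch of decoding a JSON body that carries a base64 license string and an expiry timestamp, next to the kind of KID-to-key lookup a conventional DRM would do. The field names (`license`, `expiry`), the sample values and the `KEYS_BY_KID` table are all made up for illustration; the real site's names and formats are not shown here.

```python
import base64
import json
from datetime import datetime, timezone

# Hypothetical KID -> key table, illustrating the usual "hashtable" lookup.
KEYS_BY_KID = {"kid-0001": bytes(16)}  # placeholder 16-byte key


def lookup_key(kid):
    """Return the key registered for this KID, or None if it is unknown."""
    return KEYS_BY_KID.get(kid)


def parse_license_response(body):
    """Decode a license response with assumed fields 'license' and 'expiry'."""
    data = json.loads(body)
    blob = base64.b64decode(data["license"])  # still opaque bytes at this point
    expires = datetime.fromtimestamp(data["expiry"], tz=timezone.utc)
    return blob, expires


# Fabricated sample response, only to exercise the parser.
sample = json.dumps({
    "license": base64.b64encode(b"opaque-license-bytes").decode("ascii"),
    "expiry": 1700000000,
})
blob, expires = parse_license_response(sample)
print(len(blob), expires.isoformat())
```

Decoding the base64 only gets you raw bytes. Whether those bytes are a key, an encrypted key or something else entirely is exactly the question the missing KID raised.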
From a few Google searches I knew that video files are usually AES-128 encrypted for transport security, and that the key is 16 bytes long, so a 10-letter key didn't add up. I decided to dig a lot deeper into the code and understand what the minified code was doing. There was a lot of stuffing in the code that got me lost at times, but every time I took a break I retraced my steps in my head, which opened up new ways to tackle the problem. Finally I found out that I was indeed on the right path.
Now I was stuck with a huge problem. The decryption mechanism was written in C and compiled to WebAssembly using Emscripten. After some research on Emscripten I was taken aback: I could no longer use Python to automate the process end to end, and needed Node.js to call the functions in the compiled decryptor. After a lot of grumbling I decided to give it a shot, and after a long, painstaking time de-minifying the decryptor and finding the function calls, I succeeded in building one that actually decrypts the course content.
The important lessons I took away are:
- If you come from a language like Python, where you can trust the library to handle binary responses for you, look at the various blob types the new language handles and adapt your code accordingly.
- When dealing with black boxes like the compiled decryptor, always check what data is sent and in which format, what response comes back and in which format, and what type each one is.
- Before you make assumptions, check that you know enough about the subject to enumerate the correct possibilities.
- The clearer your idea of the approach, the more time you save.
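
The first two lessons can be sketched in Python as a small guard around an opaque decryptor: check the type and length of everything going in and coming out. The `decrypt_fn` callable is a stand-in for whatever function a compiled module exposes, not the real decryptor's interface, and the 16-byte check assumes plain AES-128.

```python
AES128_KEY_LEN = 16  # AES-128 uses a 16-byte key


def as_bytes(value, name):
    """Accept only bytes-like input; refuse str so the encoding is never guessed."""
    if isinstance(value, (bytes, bytearray, memoryview)):
        return bytes(value)
    raise TypeError(f"{name} must be bytes-like, got {type(value).__name__}")


def checked_decrypt(decrypt_fn, key, ciphertext):
    """Call a black-box decryptor, validating what goes in and what comes out."""
    key = as_bytes(key, "key")
    ciphertext = as_bytes(ciphertext, "ciphertext")
    if len(key) != AES128_KEY_LEN:
        raise ValueError(f"expected {AES128_KEY_LEN}-byte key, got {len(key)}")
    result = decrypt_fn(key, ciphertext)
    return as_bytes(result, "decryptor output")


def fake_decrypt(key, data):
    """Stand-in for a real decryptor: XOR with the key, for demonstration only."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


key = bytes(range(16))
scrambled = checked_decrypt(fake_decrypt, key, b"hello world")
print(checked_decrypt(fake_decrypt, key, scrambled))
```

Given a raw 10-byte key, this guard raises a `ValueError` straight away, which surfaces a key-length mismatch at the boundary instead of deep inside the black box.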



