kion

Psp2i Model Formats

Getting started with parsing models from psp2i. Right now dealing with individual models, so the next step will be to take the file.fpb file as an input and locate the NMLL archives.


Okay, so some random thoughts/updates.

I tried to take file.fpb as an input on the page, and the browser made it part of the way through and then promptly crashed. I don't think Javascript is suited to loading an 829MB file, searching through it for offsets, and then creating a new buffer for each offset. I might need to try something like asm.js so I can be stricter about how memory is handled (potentially)? Or I might need a script that pre-processes file.fpb to get it down to more manageable pieces.
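One way to avoid holding the whole file in memory is to scan it one chunk at a time. The sketch below is only an illustration of that idea, not what the viewer currently does: `scanForMagic` and the `readRange(start, end)` callback are hypothetical names, and the callback is assumed to return a Uint8Array for the requested byte range.

```javascript
// Scan a large file for a 4-byte magic (e.g. "NMLL") one chunk at a time.
// `readRange(start, end)` is a hypothetical callback returning a Uint8Array
// with the bytes in [start, end), so only one chunk is in memory at once.
function scanForMagic(readRange, totalSize, magic, chunkSize = 1 << 20) {
  const needle = Array.from(magic, (c) => c.charCodeAt(0));
  // Read a few extra bytes so a magic that straddles two chunks is still seen.
  const overlap = needle.length - 1;
  const offsets = [];

  for (let start = 0; start < totalSize; start += chunkSize) {
    const end = Math.min(totalSize, start + chunkSize + overlap);
    const chunk = readRange(start, end);
    const last = chunk.length - needle.length;
    for (let i = 0; i <= last; i++) {
      let hit = true;
      for (let j = 0; j < needle.length; j++) {
        if (chunk[i + j] !== needle[j]) { hit = false; break; }
      }
      if (hit) offsets.push(start + i);
    }
  }
  return offsets;
}
```

In the browser the callback could be backed by `File.slice(start, end)`, though reading a slice there is asynchronous, so the loop would need to become `async` as well.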

For the nbl archives, I have a bunch of questions about how to handle them. There are two kinds of nbl archives, one with 0x10 at offset 0x05 and one with 0x00 at offset 0x05. We're able to unpack the nbl archives with 0x00 at offset 0x05. I'm hoping that maybe Agrajag or JamRules can provide more information on this format. We can at least read the filenames from the NMLL header for both of these types, so right now I can write a function that takes an nbl archive as an input, reads the file header, and checks to see if the archive contains a .unj model file. If none are found the nbl file is discarded; otherwise we attempt to decrypt and decompress the NMLL and TMLL bodies and store the archive.

The next thing I'm thinking about is grouping and how reliable the filenames are going to be. At first I was hoping that the developers created unique names for each asset, and that when names were re-used it was because the same asset was being cloned for a different scene. After seeing the number of duplicated filenames, though, it looks like they probably used the same name for different assets, but I'll have to do an md5 check to see if these are the same files or not. If the md5 sums match up, that means I can go ahead and extract everything out to the raw files; otherwise it might indicate that I'll have to group by nbl archive.
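The md5 check could look something like the sketch below, written for nodejs since the browser's built-in crypto does not offer MD5. The `{ name, data }` record shape and the function names are assumptions made for this example.

```javascript
const { createHash } = require('crypto'); // Node.js built-in module

// Hex MD5 digest of a file's bytes (Buffer or Uint8Array).
function md5Hex(bytes) {
  return createHash('md5').update(bytes).digest('hex');
}

// Return the filenames that map to more than one distinct content hash.
// `files` is a hypothetical array of { name, data } records.
function findConflictingNames(files) {
  const hashesByName = new Map();
  for (const { name, data } of files) {
    if (!hashesByName.has(name)) hashesByName.set(name, new Set());
    hashesByName.get(name).add(md5Hex(data));
  }
  return [...hashesByName]
    .filter(([, hashes]) => hashes.size > 1)
    .map(([name]) => name);
}
```

If the returned list is empty, every re-used name refers to identical bytes and the files can be extracted flat; any name that does show up would need to be grouped by its nbl archive instead.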

I think what I should be focusing on right now is to continue to work with nbl archive 8894 which contains the drops. I can set up my function to take the nbl file as an argument, extract and display the filenames, and then parse the textures and display the model on click.


Small update: got Blowfish ported to vanilla Javascript. It might not be an exciting picture to see some filenames in a console, but it's pretty exciting to see something like this working in vanilla Javascript, as there are a few quirks to working with these files in the browser. Since the browser can't access the file system directly, you need to have the user select the file and then read it with the FileReader API to get the raw data as an ArrayBuffer.

For Blowfish specifically, the algorithm relies on the wrap-around behavior of the uint32_t type, which Javascript's built-in numbers don't share. To get around this you can declare a Uint32Array, whose members are truncated to unsigned 32-bit values on every store, so they act like they would in a native programming language. And last, the DataView API has been added, which allows for reading and writing different datatypes in an ArrayBuffer. The result is being able to parse files directly in the browser.
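A small illustration of both points, using made-up values rather than anything from the actual Blowfish port:

```javascript
// Plain JavaScript numbers are doubles, so adding past 2^32 does not wrap.
const plainSum = 0xffffffff + 1;

// A Uint32Array element is truncated to 32 bits on every store, like uint32_t.
const reg = new Uint32Array(1);
reg[0] = 0xffffffff;
reg[0] += 1;
const wrappedSum = reg[0];

// Bitwise operators return signed 32-bit results; storing the result in the
// Uint32Array turns it back into an unsigned value.
const signedXor = 0x80000000 ^ 0x00000001;
reg[0] = signedXor;
const unsignedXor = reg[0];

// DataView reads typed values out of an ArrayBuffer with explicit endianness.
const bytes = new Uint8Array([0x4e, 0x4d, 0x4c, 0x4c, 0x34, 0x12, 0x00, 0x00]);
const view = new DataView(bytes.buffer);
const magic = String.fromCharCode(bytes[0], bytes[1], bytes[2], bytes[3]);
const fieldValue = view.getUint32(4, true); // true = little-endian

console.log(plainSum, wrappedSum, signedXor, unsignedXor, magic, fieldValue);
```

Here `plainSum` ends up larger than any 32-bit value while `wrappedSum` is back at zero, which is the difference that matters for Blowfish.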

Right now I just have the header; the next step will be to decompress and save the files.

917849921_Screenshotfrom2019-09-3021-25-01.thumb.png.ce3c7d769a8690109247c15428777441.png

 

Edit:

 

1701698971_Screenshotfrom2019-10-0101-50-41.thumb.png.ec816b3bb2f7e92cd924ac03f20002ea.png

 

PRS is working too. Time to store the files to be used.

 

Edited by kion
small update

Okay, so now files are unpacked from the nbl archive, and the question is what the best way to store and sort them is. I added a few properties to make it easier to find things, like extension and basename, added a shared unique id for files in an nbl archive, and added an md5. So what I think I will do is set filename + md5 as the primary key. Then to find the models I can search for .unj on the extension. And since I'm only interested in models, what I might do is filter out files from the NMLL section that don't match .unj, .unt, .unv, or .unm.
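As a rough sketch of that storage plan (the record fields and function names here are placeholders, not the actual schema), the filter and the filename + md5 key might look like this:

```javascript
// Extensions worth keeping from the NMLL section, per the plan above.
const MODEL_EXTENSIONS = new Set(['.unj', '.unt', '.unv', '.unm']);

// Lower-cased extension including the dot, or '' when there is none.
function extensionOf(filename) {
  const dot = filename.lastIndexOf('.');
  return dot === -1 ? '' : filename.slice(dot).toLowerCase();
}

// Build an index keyed by "filename:md5". `records` is a hypothetical array
// of { name, md5, archiveId, data } objects produced by the unpacker.
function indexModelFiles(records) {
  const index = new Map();
  for (const record of records) {
    const ext = extensionOf(record.name);
    if (!MODEL_EXTENSIONS.has(ext)) continue; // drop unrelated file types
    const key = `${record.name}:${record.md5}`;
    if (index.has(key)) continue; // identical file already stored
    const basename = record.name.slice(0, record.name.length - ext.length);
    index.set(key, { ...record, ext, basename });
  }
  return index;
}
```

Keying on both values means a re-used filename with different contents gets its own entry, while exact duplicates across archives collapse into one.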

 

457802583_Screenshotfrom2019-10-0108-53-45.thumb.png.66872f56b7bf82b1404a78d51b280613.png


Okay, so managing file.fpb as an input might be an over-reach, but it looks like I can now access nbl files for uploading. That means I can either figure out a better way to implement fpb without crashing, or take a list of nbl files as input, which would require running a script to pre-separate them but would otherwise work.

701825523_Screenshot_2019-10-02Psp2Infinity.thumb.png.ec02b14b1f65f97a15938477f9998ab1.png

For the viewer side of things, to keep the list of models from being too long, I split the models into groups by the first two letters of the model name. It seems to work for the limited test cases I've been working with so far. We'll have to see if the approach holds up.
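The grouping itself is only a few lines. A minimal sketch, with `groupByPrefix` as an illustrative name:

```javascript
// Group model filenames by their first `prefixLength` characters.
function groupByPrefix(names, prefixLength = 2) {
  const groups = new Map();
  for (const name of names) {
    const prefix = name.slice(0, prefixLength).toLowerCase();
    if (!groups.has(prefix)) groups.set(prefix, []);
    groups.get(prefix).push(name);
  }
  return groups;
}
```

Each Map key can then become a collapsible category in the list on the left, with the filenames underneath it.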

Right now the model parsing side is still pretty limited. I guess I'll start with the simple drop models and work from there. Though before working on models I think I'll go ahead and implement textures to get those out of the way first, because from what I've seen so far, the texture implementation is surprisingly straightforward.


Okay, so now that we can import assets and store them to be referenced later, the next step is to start working on parsing the files. So the way I have it now is that unj files are split into categories and then listed on the left. When a unj file is clicked, we can then load it and parse it.

So I did some testing with nodejs as far as converting textures to pngs, and it looks like the texture format is pretty straight forward. So before we get into the models, we can take a second to make sure the textures are working, and that should help us later as far as identifying which assets are what for the models, which can help with debugging.

So to be able to know which texture needs to be read for which model, we need to reference the texture list, which will tell us which textures need to be linked with which materials. And thankfully they re-used the NJTL ninja texture list format from the Dreamcast days. Possibly with a slight modification, but looking at the file, it's pretty easy to trace through.

1217405609_Screenshotfrom2019-10-0223-48-33.png.ef2c78b7727e2deaad33f10e4bc4fa59.png

So first comes the magic number NUTL, followed by the length, followed by the pointer to the header, which is on 0x24 in this case. Then the header is the number of textures, followed by a pointer to the first texture struct, which is on 0x10. 0x10 and 0x14 are both zero. This is because these values are used for attributes after the texture has been loaded into RAM, but not before. And that just leaves a pointer to the start of the texture, which is 0x2c.

So this is only for a .unt file with one texture. So to get the exact format of the header, we'll have to find a texture list that references more than one texture, but for now we have something that works. We can get the name of the texture, load it, parse it, and then have it ready to be paired with the model geometry when that gets loaded.


Small update. Textures have been added. Right now I'm throwing the textures somewhere on the page (currently outlined in red) to check that the texture has been parsed correctly. I should probably add another window to the UI that can be used for viewing textures that have been loaded in for a model. To save on time, I might make a white box and stick it under the 3d viewport, and then move on to setting the textures to the map diffuse uv.

74605824_Screenshot_2019-10-04Psp2Infinity.thumb.png.26be2e554eebdba7a94ade540ccbb81d.png

 


We're back to something that resembles the first post with a few improvements. The main improvement is that we're no longer unpacking the files and then storing them, but can take archives and read the files directly from there. Textures have also been implemented, and I fixed the vertex winding order for the faces.

738288901_Screenshot_2019-10-05Psp2Infinity.thumb.png.6bc17acb265f65b85df119923b8442e9.png

Next step is to test the meshes that are included in the basic archive and then start scaling up from there.


Oh look, a different model. Now that we are able to read archives and textures, the next step is to track down different vertex lists to work out which flags do what. The game seems to group each vertex in the order of diffuse map uv, color, and position. They also seem to use both 32 bit floats and 16 bit ints to define values. So the flag defines what types of attributes are in the vertex list and what format they are in. So basically we find a model, debug the vertex list, find another model and debug again.

1969981165_Screenshot_2019-10-06Psp2Infinity.thumb.png.4c2a15814b47a7395a7df34d94eaf497.png

So to start out we have the calorie mate item, which I figured would be a good place to start with the 16 bit values. We have 16 bit uv values and 16 vertices. I don't think I have the exact scale for the vertices and uv values, but I'm surprised with how close I got by plugging in likely values.
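To make the debugging loop a bit more concrete, here is a rough sketch of reading one vertex in the uv, color, position order described above. Nearly everything in it is an assumption: the `format` descriptor is a hypothetical stand-in for the real vertex flag, the color is assumed to be a single 32-bit value, the data is assumed to be little-endian with no alignment padding, and the scale factors are parameters precisely because the real ones are not known yet.

```javascript
// Read one vertex laid out as uv, color, position.
// `format` is a hypothetical descriptor, not the game's actual flag layout:
//   { uv: 'i16' | 'f32', uvScale, hasColor, position: 'i16' | 'f32', positionScale }
function readVertex(view, offset, format) {
  const littleEndian = true; // assumption
  let p = offset;

  // Read `count` components, scaling 16-bit integers by `scale`.
  const readComponents = (count, kind, scale) => {
    const out = [];
    for (let i = 0; i < count; i++) {
      if (kind === 'f32') {
        out.push(view.getFloat32(p, littleEndian));
        p += 4;
      } else {
        out.push(view.getInt16(p, littleEndian) * scale);
        p += 2;
      }
    }
    return out;
  };

  const vertex = {};
  vertex.uv = readComponents(2, format.uv, format.uvScale);
  if (format.hasColor) {
    vertex.color = view.getUint32(p, littleEndian); // assumed 32-bit color
    p += 4;
  }
  vertex.position = readComponents(3, format.position, format.positionScale);
  vertex.byteLength = p - offset; // stride consumed by this vertex
  return vertex;
}
```

Keeping the scales as inputs makes it easy to plug in likely values and compare the result in the viewer without touching the reader itself.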


Grabbing more files from the common models archive. Time to start looking into other archives for test cases. Two notes: nmll-09283.nbl is broken, and nmll-08904.nbl might be a good test candidate for the next set of vertex flag testing.

Screenshot_2019-10-07 Psp2 Infinity.png


I realized what I was missing in life: tables.

2131898063_Screenshot_2019-11-10PhantasyStarUniverseSeriesPhotonDrop.png.15d5bcb7f44e798c24c0062aaaa97657.png

So I'm not very knowledgeable about PSU file formats as I haven't had a chance to dig into them at all. But I was able to ask someone who knows a lot more than me, Agrajag, about the file formats. It looks like the types of files used boil down to a couple of key differences. For models PSU uses the .xja format and PSP uses the .unj format. I think Agrajag also mentioned that the model animations, texture animations, bone names and texture names have different extensions for PSU and PSP, but the file formats are the same for all of them.

And then come the differences in archives. All of the titles seem to use the .nbl archive format, which I think stands for "new-bml" if anyone is familiar with the PSO .bml archive type. It seems to have a similar role and structure, with an NMLL section for the model-related files and a TMLL section for the texture files. The texture files seem to have some variation, with .xvr on PSU, .gim on PSP1 and .uvr on PSP2(i). As for the file system, PSU uses the computer's folders directly, so all of the game's files are in the /DATA folder. For the PSP titles, all of the games have a 'file.fpb' file, which is around 800MB and has all of the assets in the game stored as a list of .nbl archives.

As for the differences between the PSP titles, the basic files such as the models and animations seem to have remained the same. The main difference in the raw files is between the .gim files and .uvr files for the textures. It looks like PSP2(i) contains a few .gim files that were likely re-used from PSP1, but aside from that the vast majority are in .uvr format. Otherwise the main difference between PSP1 and PSP2(i) is obfuscation. The fpb and nbl file types were changed in an attempt to make them more difficult to access.

So in terms of the portable titles, the list of file types seems to break down into .unj, .unm, .gim, .uvr, .una, .unt, .nbl and .fpb.

For fpb files, it's kind of frustrating that I don't have a better method than the "slice" method of looking for the NMLL header and writing out the span between one header and the next as its own file. Having some more information on how the game manages the header and how it stores the file names would definitely make navigating the file system on the portable titles a lot easier. And right now my extractor is running into an error on PSP1, so I'll have to track down the reason for that to see what's going on. After that I can get back to focusing on the model format and taking more notes.

Edit:

Okay, the reason the nbl file wasn't working was that it was an empty nbl file with a header but no content. So that means I can get back to looking at the basic file types. As a side note, being able to get the filenames from fpb files and type 2 nbl files would make working with the portable series a lot easier.

Edited by kion
fix things


Right now I don't have too much to write about in terms of the FPB tarball since I'm simply using the "look for magic numbers and slice" method. It seems like there should be a cleaner approach that involves reading the file header and then finding the offsets in the file. I've been taking a second look at NBL and I think I can write some notes about that.
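For reference, the "look for magic numbers and slice" method amounts to something like the sketch below. `sliceByMagic` is an illustrative name, and the whole approach assumes the magic bytes never appear by coincidence inside an archive body, which compressed or encrypted data does not guarantee.

```javascript
// Split a buffer into pieces that each begin at an occurrence of `magic`.
// Any bytes before the first occurrence (e.g. a container header) are dropped.
function sliceByMagic(bytes, magic) {
  const needle = Array.from(magic, (c) => c.charCodeAt(0));
  const starts = [];
  for (let i = 0; i + needle.length <= bytes.length; i++) {
    if (needle.every((b, j) => bytes[i + j] === b)) starts.push(i);
  }
  // Each piece runs from one magic to the next, or to the end of the buffer.
  // subarray() returns views, so no archive data is copied here.
  return starts.map((start, k) => {
    const end = k + 1 < starts.length ? starts[k + 1] : bytes.length;
    return bytes.subarray(start, end);
  });
}
```

Because the pieces are views onto the original buffer, this avoids allocating a new buffer per archive, but the source buffer still has to be in memory.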

nmll_layout.thumb.png.5094fc12ec7f428c2573d17e4fcac2d6.png

The figure above shows the outline of the structure of an NBL archive. For NBL I'm guessing the extension stands for "New BML", as it has a similar function and approach to PSO's BML archive format, but a lot has been changed in the implementation. In terms of function it acts the same: it groups a list of models and the files associated with those models, such as animations, together with the textures for those models, all in a single compressed file. In terms of structure, the way it does this is somewhat different.

NBL groups information into two main sections. First is the NMLL section, which contains any kind of information that isn't textures, so the content is anything from models to animations and texture lists. The second is the TMLL section, which contains the textures. Each section has its own header and body, and the two are essentially the same format. The NMLL header has 0x30 bytes of metadata at the top followed by the file list, while the TMLL header has 0x20 bytes followed by the file list. The body is the same for both sections: it can be compressed and it can be encrypted, in that order. So if the body is compressed then decompress it (with PRS for type 0), and then decrypt it if it is encrypted (with Blowfish). Once decompressed and decrypted, the files are stacked end to end, and their offsets and lengths are provided in the respective headers.

The main difference between the two sections is that texture data is likely copied directly into the framebuffer. So the archive only needs to unpack the files and then copy the information directly into the framebuffer. The NMLL section of the archive gets copied as a block directly into memory, and the archive is implemented so that all pointers in the NMLL files are zero relative to the start of the body. And then following the body is a list of pointer offsets inside the body, with the reasoning being that it allows for all of the pointers to be easily adjusted to the location copied into memory.
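The pointer adjustment can be pictured with a short sketch. It assumes the pointers are 32-bit little-endian values and that the pointer list has already been decoded into an array of byte offsets into the body; neither detail is confirmed here, and `relocatePointers` is a name invented for the example.

```javascript
// Add `base` to every pointer stored in `body` (a Uint8Array), in place.
// `pointerOffsets` gives the byte offset of each pointer inside the body.
// Assumption: pointers are unsigned 32-bit little-endian values.
function relocatePointers(body, pointerOffsets, base) {
  const view = new DataView(body.buffer, body.byteOffset, body.byteLength);
  for (const at of pointerOffsets) {
    const relative = view.getUint32(at, true);
    // ">>> 0" keeps the sum inside the unsigned 32-bit range.
    view.setUint32(at, (relative + base) >>> 0, true);
  }
  return body;
}
```

The game would use the load address as `base`; an extractor could instead pass a negative value, such as minus a file's offset in the body, to make the pointers relative to the start of that file.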

In terms of approach for extracting the files from an NBL the approach is a little messy. First read the meta data in the header to try and find the number of files in each respective section, the length of each header in bytes, the compressed size of each body and then the decompressed size of each body. And then get the key and initialize blowfish on the condition the files are encrypted.

Then it seems like the next step is to seek through the file to find where each segment is and slice it into its own section of information. The reason for using slicing is kind of two-fold. The main reason is that it's unclear whether there is a pattern to the amount of padding between each group of information. After the NMLL header there is zero padding before the body, and the start of the body is the first non-zero byte following it. That seems to hold true for all of the segments: after the NMLL body is an unknown amount of zero padding followed by the pointer list. The second reason is that the TMLL section is optional and doesn't exist in a lot of cases. When it does exist, its position needs to be found by looking for the TMLL magic number, which is then followed by more padding and then the TMLL body.
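The two seeking primitives this relies on are small. A minimal sketch, with helper names made up for illustration:

```javascript
// Return the index of the first non-zero byte at or after `offset`,
// or -1 when only zero padding remains.
function skipZeroPadding(bytes, offset) {
  let i = offset;
  while (i < bytes.length && bytes[i] === 0) i++;
  return i < bytes.length ? i : -1;
}

// Return the index of the first occurrence of `magic` at or after `from`,
// or -1 when it is not present (e.g. an archive with no TMLL section).
function indexOfMagic(bytes, magic, from = 0) {
  const needle = Array.from(magic, (c) => c.charCodeAt(0));
  for (let i = from; i + needle.length <= bytes.length; i++) {
    if (needle.every((b, j) => bytes[i + j] === b)) return i;
  }
  return -1;
}
```

One caveat with the first helper: if a segment ever began with a zero byte, scanning for the first non-zero byte would land in the wrong place, which is part of why declared offsets would be more robust.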

So most of the difficulty in handling this function comes from handling the different conditions and seeking through to find the different segments. Once each segment has been isolated, you decrypt and read the header. Then decompress and decrypt the body, and then in the case of NMLL adjust the pointers, and then pull out the files. It seems like something that could be addressed by adding a few offset pointers in the headers, but at best I can only speculate about the reasons behind this. The best reason that I can think of is that the archive is intended to be more of a streaming format, where the reader starts at the beginning and seeks through the file, copying information into memory as it goes along. Though having a few more offsets declared in the header would make this format much easier to work with.


1795028326_Screenshot_2019-11-13formatnblPhantasyStarOnlineDevelopersWiki.png.dc8eb9e8e5b3d6e903f98c453ac8484f.png

Continuing on with NBL archives. I've been trying to think of ways to simplify the code and get around the lack of offsets declared in the header. But I guess whether they're actually needed is a little debatable, since once one segment has been read, the next section is found by scanning instead of seeking. I think the real problem is the combination of having to scan and not knowing the exact format of what's contained in the archive. Maybe writing out an approach will help me simplify the code that I have.

So to start out, it seems like a good idea to read 0x00 as a quick sanity check to check for the NMLL magic and return if it's not found. Then we're probably going to want to read 0x04 for the compression format and then 0x1c to get the seed for blowfish. And that way we'll have those parameters for working with the rest of the archive.

Next we're going to need 0x08 to get the byte length of the NMLL header and then 0x0c for the number of files in the NMLL section. With that we can decrypt the NMLL header to get the filenames, offsets and length of all of the files contained. Then we can start from the end of the NMLL header and seek until the first non-zero byte. This will be the NMLL body, and if we can't find this section, then we return. If there is a body then we decompress and decrypt if needed. Then we continue to seek to the first non-zero byte after the body. This will be the pointer table. So once we have the pointer table and the raw NMLL body, then we can go ahead and unpack those files and add them to the files to be returned.
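The fixed-offset reads at the start of that procedure could be sketched as follows. The offsets are the ones listed in this post; treating each field as a 32-bit little-endian integer is an assumption, and `readNmllMeta` is just a name for the example.

```javascript
// Read the unencrypted metadata fields at the top of an NBL archive.
// Returns null when the NMLL magic is missing or the buffer is too short.
// Assumption: every field is an unsigned 32-bit little-endian integer.
function readNmllMeta(bytes) {
  if (bytes.length < 0x30) return null; // NMLL metadata block is 0x30 bytes
  const magic = String.fromCharCode(bytes[0], bytes[1], bytes[2], bytes[3]);
  if (magic !== 'NMLL') return null; // quick sanity check at 0x00

  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    compressionFormat: view.getUint32(0x04, true),
    headerLength: view.getUint32(0x08, true), // byte length of the NMLL header
    fileCount: view.getUint32(0x0c, true),    // number of files in NMLL
    blowfishSeed: view.getUint32(0x1c, true), // seed for the Blowfish key
  };
}
```

Returning null for anything that fails the magic check lets the caller skip broken or empty archives early, before any decryption is attempted.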

Once we've hit the end of the pointer table, we seek until we find the TMLL header and return if it doesn't exist. If it does exist then we read 0x20 to get the header size and then 0x2c to get the number of textures in the section. We then decrypt the header and get the filenames and offsets. Then we seek to the next non-zero byte after the header, get the body, and decompress it. Then we unpack the textures and add them to the files to be returned. And then we return the unpacked files from the archive.

