littlefs

mirror of https://github.com/littlefs-project/littlefs.git synced 2025-10-15 04:17:47 +08:00

Author	SHA1	Message	Date
Christopher Haster	d1b254da2c	Reverted removal of 1-bit counter threaded through tags Initially I thought the fcrc would be sufficient for all of the end-of-commit context, since indicating that there is a new commit is a simple as invalidating the fcrc. But it turns out there are cases that make this impossible. The surprising, and actually common, case, is that of an fcrc that will end up containing a full commit. This is common as soon as the prog_size is big, as small commits are padded to the prog_size at minimum. .------------------. \ \| metadata \| \| \| \| \| \| \| +-. \|------------------\| \| \| \| foward CRC ------------. \|------------------\| / \| \| \| commit CRC -----' \| \|------------------\| \| \| padding \| \| \| \| \| \|------------------\| \ \ \| \| metadata \| \| \| \| \| \| +-. \| \| \| \| \| \| +-' \|------------------\| / \| \| \| commit CRC --------' \| \|------------------\| \| \| \| / '------------------' When the commit + crc is all contained in the fcrc, something silly happens with the math behind crcs. Everything in the commit gets canceled out: crc(m) = m(x) x^\|P\|-1 mod P(x) m ++ crc(m) = m(x) x^\|P\|-1 + (m(x) x^\|P\|-1 mod P(x)) crc(m ++ crc(m)) = (m(x) x^\|P\|-1 + (m(x) x^\|P\|-1 mod P(x))) x^\|P\|-1 mod P(x) crc(m ++ crc(m)) = (m(x) x^\|P\|-1 + m(x) x^\|P\|-1) x^\|P\|-1 mod P(x) crc(m ++ crc(m)) = 0 * x^\|P\|-1 mod P(x) This is the reason the crc of a message + naive crc is zero. Even with an initializer/bit-fiddling, the crc of the whole commit ends up as some constant. So no manipulation of the commit can change the fcrc... But even if this did work, or we changed this scheme to use two different checksums, it would still require calculating the fcrc of the whole commit to know if we need to tweak the first bit to invalidate the unlikely-but-problematic case where we happen to match the fcrc. This would add a large amount of complexity to the commit code. It's much simpler and cheaper to keep the 1-bit counter in the tag, even if it adds another moving part to the system.	2022-12-17 12:42:05 -06:00
Christopher Haster	2f26966710	Continued implementation of forward-crcs, adopted new test runners This fixes most of the remaining bugs (except one with multiple padding commits + noop erases in test_badblocks), with some other code tweaks. The biggest change was dropping reliance on end-of-block commits to know when to stop parsing commits. We can just continue to parse tags and rely on the crc for catch bad commits, avoiding a backwards-compatiblity hiccup. So no new commit tag. Also renamed nprogcrc -> fcrc and commitcrc -> ccrc and made naming in the code a bit more consistent.	2022-12-17 12:42:05 -06:00
Christopher Haster	b4091c6871	Switched to separate-tag encoding of forward-looking CRCs Previously forward-looking CRCs was just two new CRC types, one for commits with forward-looking CRCs, one without. These both contained the CRC needed to complete the current commit (note that the commit CRC must come last!). [-- 32 --\|-- 32 --\|-- 32 --\|-- 32 --] with: [ crc3 tag \| nprog size \| nprog crc \| commit crc ] without: [ crc2 tag \| commit crc ] This meant there had to be several checks for the two possible structure sizes, messying up the implementation. [-- 32 --\|-- 32 --\|-- 32 --\|-- 32 --\|-- 32 --] with: [nprogcrc tag\| nprog size \| nprog crc \| commit tag \| commit crc ] without: [ commit tag \| commit crc ] But we already have a mechanism for storing optional metadata! The different metadata tags! So why not use a separate tage for the forward-looking CRC, separate from the commit CRC? I wasn't sure this would actually help that much, there are still necessary conditions for wether or not a forward-looking CRC is there, but in the end it simplified the code quite nicely, and resulted in a ~200 byte code-cost saving.	2022-12-17 12:42:05 -06:00
Christopher Haster	52dd83096b	Initial implementation of forward-looking erase-state CRCs This change is necessary to handle out-of-order writes found by pjsg's fuzzing work. The problem is that it is possible for (non-NOR) block devices to write pages in any order, or to even write random data in the case of a power-loss. This breaks littlefs's use of the first bit in a page to indicate the erase-state. pjsg notes this behavior is documented in the W25Q here: https://community.cypress.com/docs/DOC-10507 --- The basic idea here is to CRC the next page, and use this "erase-state CRC" to check if the next page is erased and ready to accept programs. .------------------. \ commit \| metadata \| \| \| \| +---. \| \| \| \| \|------------------\| \| \| \| erase-state CRC -----. \| \|------------------\| \| \| \| \| commit CRC ---\|-\|-' \|------------------\| / \| \| padding \| \| padding (doesn't need CRC) \| \| \| \|------------------\| \ \| next prog \| erased? \| +-' \| \| \| \| \| v \| / \| \| \| \| '------------------' This is made a bit annoying since littlefs doesn't actually store the page (prog_size) in the superblock, since it doesn't need to know the size for any other operation. We can work around this by storing both the CRC and size of the next page when necessary. Another interesting note is that we don't need to any bit tweaking information, since we read the next page every time we would need to know how to clobber the erase-state CRC. And since we only read prog_size, this works really well with our caching, since the caches must be a multiple of prog_size. This also brings back the internal lfs_bd_crc function, in which we can use some optimizations added to lfs_bd_cmp. Needs some cleanup but the idea is passing most relevant tests.	2022-12-17 12:42:05 -06:00
Christopher Haster	5137e4b0ba	Last minute tweaks to debug scripts - Standardized littlefs debug statements to use hex prefixes and brackets for printing pairs. - Removed the entry behavior for readtree and made -t the default. This is because 1. the CTZ skip-list parsing was broken, which is not surprising, and 2. the entry parsing was more complicated than useful. This functionality may be better implemented as a proper filesystem read script, complete with directory tree dumping. - Changed test.py's --gdb argument to take [init, main, assert], this matches the names of the stages in C's startup. - Added printing of tail to all mdir dumps in readtree/readmdir. - Added a print for if any mdirs are corrupted in readtree. - Added debug script side-effects to .gitignore.	2020-03-29 21:19:33 -05:00
Christopher Haster	a7dfae4526	Minor tweaks to debugging scripts, fixed explode_asserts.py off-by-1 - Changed readmdir.py to print the metadata pair and revision count, which is useful when debugging commit issues. - Added truncated data view to readtree.py by default. This does mean readtree.py must read all files on the filesystem to show the truncated data, hopefully this does not end up being a problem. - Made overall representation hopefully more readable, including moving superblock under the root dir, userattrs under files, fixing a gstate rendering issue. - Added rendering of soft-tails as dotted-arrows, hopefully this isn't too noisy. - Fixed explode_asserts.py off-by-1 in #line mapping caused by a strip call in the assert generation eating newlines. The script matches line numbers between the original+modified files by emitting assert statements that use the same number of lines. An off-by-1 here causes the entire file to map lines incorrectly, which can be very annoying.	2020-02-22 23:50:03 -06:00
Christopher Haster	6a550844f4	Modified readmdir/readtree to make reading non-truncated data easier Added indention so there was a more clear separation between the tag description and tag data. Also took the best parts of readmdir.py and added it to readtree.py. Initially I was thinking it was best for these to have completely independent data representations, since you could always call readtree to get more info, but this becomes tedius when needed to look at low-level tag info across multiple directories on the filesystem.	2020-02-09 12:00:23 -06:00
Christopher Haster	f4b6a6b328	Fixed issues with neighbor updates during moves The root of the problem was some assumptions about what tags could be sent to lfs_dir_commit. - The first assumption is that there could be only one splice (create/delete) tag at a time, which is trivially broken by the core commit in lfs_rename. - The second assumption is that there is at most one create and one delete in a single commit. This is less obvious but turns out to not be true in the case that we rename a file such that it overwrites another file in the same directory (1 delete for source file, 1 delete for destination). - The third assumption was that there was an ordering to the delete/creates passed to lfs_dir_commit. It may be possible to force all deletes to follow creates by rearranging the tags in lfs_rename, but this risks overflowing tag ids. The way the lfs_dir_commit first collected the "deletetag" and "createtag" broke all three of these assumptions. And because we lose the ordering information we can no longer apply the directory changes to open files correctly. The file ids may be shifted in a way that doesn't reflect the actual operations on disk. These problems were made worst by lfs_dir_commit cleaning up moves implicitly, which also creates deletes implicitly. While cleaning up moves in lfs_dir_commit may save some code size, it makes the commit logic much more difficult to implement correctly. This bug turned into pulling out a dead tree stump, roots and all. I ended up reworking how lfs_dir_commit updates open files so that it has less assumptions, now it just traverses the commit tags multiple times in order to update file ids after a successful commit in the correct order. This also got rid of the dir copy by carefully updating split dirs after all files have an up-to-date copy of the original dir. I also just removed the implicit move cleanup. It turns out the only commits that can occur before we have cleaned up the move is in lfs_fs_relocate, so it was simple enough to explicitly handle this case when we update our parent and pred during a relocate. Cases where we may need to fix moves: - In lfs_rename when we move a file/dir - In lfs_demove if we lose power - In lfs_fs_relocate if we have to relocate our parent and we find it had a pending move (or else the move will be outdated) - In lfs_fs_relocate if we have to relocate our predecessor and we find it had a pending move (or else the move will be outdated) Note the two cases in lfs_fs_relocate may be recursive. But lfs_fs_relocate can only trigger other lfs_fs_relocates so it's not possible for pending moves to spill out into other filesystem commits And of couse, I added several tests to cover these situations. Hopefully the rename-with-open-files logic should be fairly locked down now. found with initial fix by eastmoutain	2020-01-20 19:27:27 -06:00
Christopher Haster	9453ebd15d	Added/improved disk-reading debug scripts Also fixed a bug in dir splitting when there's a large number of open files, which was the main reason I was trying to make it easier to debug disk images. One part of the recent test changes was to move away from the file-per-block emubd and instead simulate storage with a single contiguous file. The file-per-block format was marginally useful at the beginning, but as the remaining bugs get more subtle, it becomes more useful to inspect littlefs through scripts that make the underlying metadata more human-readable. The key benefit of switching to a contiguous file is these same scripts can be reused for real disk images and can even read through /dev/sdb or similar. - ./scripts/readblock.py disk block_size block off data 00000000: 71 01 00 00 f0 0f ff f7 6c 69 74 74 6c 65 66 73 q.......littlefs 00000010: 2f e0 00 10 00 00 02 00 00 02 00 00 00 04 00 00 /............... 00000020: ff 00 00 00 ff ff ff 7f fe 03 00 00 20 00 04 19 ............... 00000030: 61 00 00 0c 00 62 20 30 0c 09 a0 01 00 00 64 00 a....b 0......d. ... readblock.py prints a hex dump of a given block on disk. It's basically just "dd if=disk bs=block_size count=1 skip=block \| xxd -g1 -" but with less typing. - ./scripts/readmdir.py disk block_size block1 block2 off tag type id len data (truncated) 0000003b: 0020000a dir 0 10 63 6f 6c 64 63 6f 66 66 coldcoff 00000049: 20000008 dirstruct 0 8 02 02 00 00 03 02 00 00 ........ 00000008: 00200409 dir 1 9 68 6f 74 63 6f 66 66 65 hotcoffe 00000015: 20000408 dirstruct 1 8 fe 01 00 00 ff 01 00 00 ........ readmdir.py prints info about the tags in a metadata pair on disk. It can print the currently active tags as well as the raw log of the metadata pair. - ./scripts/readtree.py disk block_size superblock "littlefs" version v2.0 block_size 512 block_count 1024 name_max 255 file_max 2147483647 attr_max 1022 gstate 0x000000000000000000000000 dir "/" mdir {0x0, 0x1} rev 3 v id 0 superblock "littlefs" inline size 24 mdir {0x77, 0x78} rev 1 id 0 dir "coffee" dir {0x1fc, 0x1fd} dir "/coffee" mdir {0x1fd, 0x1fc} rev 2 id 0 dir "coldcoffee" dir {0x202, 0x203} id 1 dir "hotcoffee" dir {0x1fe, 0x1ff} dir "/coffee/coldcoffee" mdir {0x202, 0x203} rev 1 dir "/coffee/warmcoffee" mdir {0x200, 0x201} rev 1 readtree.py parses the littlefs tree and prints info about the semantics of what's on disk. This includes the superblock, global-state, and directories/metadata-pairs. It doesn't print the filesystem tree though, that could be a different tool.	2020-01-20 19:27:27 -06:00

9 Commits