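I need a way to insert some file clusters into the middle of a file in order to insert some data.
Normally I would just read the entire file and write it back out with the changes, but the files are multiple gigabytes in size, and it takes 30 minutes just to read a file and write it back out again.
The cluster size doesn't bother me; I can essentially write out zeros to the end of my inserted clusters, and it will still work in this file format.
How would I use the Windows File API (or some other mechanism) to modify the file's allocation table, inserting one or more unused clusters at a specified point in the middle of the file?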
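[Edit:]
Blah - I'm going to say "this ain't doable, at least not via MFT modification, without a LOT of pain". First off, the NTFS MFT structures themselves are not 100% "open", so I'm starting to delve into reverse-engineering territory, which has legal repercussions I'm in no mood to deal with. Also, doing this in .NET is a hyper-tedious process of mapping and marshalling structures based on a lot of guesswork (and don't get me started on the fact that most of the MFT structures are compressed in strange ways). Short story: while I did learn an awful lot about how NTFS "works", I'm no closer to a solution to this problem.
[/Edit]
Ugh... sooo much marshalling nonsense...
This struck me as "interesting", so I was compelled to poke around at the problem... it's still an "answer in progress", but I wanted to post what I've got so far to help others come up with something. 🙂
Also, I have a rough sense that this would be FAR easier on FAT32, but given that I've only got NTFS to work with...
So - lots of pinvoking and marshalling, so let's start there and work backwards:
As one might guess, the standard .NET File/IO APIs aren't going to help you much here - we need device-level access: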
    using System;
    using System.IO;
    using System.Runtime.InteropServices;
    using System.Threading;
    using Microsoft.Win32.SafeHandles;

    public static class NativeMethods
    {
        [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
        public static extern SafeFileHandle CreateFile(
            string lpFileName,
            [MarshalAs(UnmanagedType.U4)] FileAccess dwDesiredAccess,
            [MarshalAs(UnmanagedType.U4)] FileShare dwShareMode,
            IntPtr lpSecurityAttributes,
            [MarshalAs(UnmanagedType.U4)] FileMode dwCreationDisposition,
            [MarshalAs(UnmanagedType.U4)] FileAttributes dwFlagsAndAttributes,
            IntPtr hTemplateFile);

        [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
        public static extern bool ReadFile(
            SafeFileHandle hFile,               // handle to file
            byte[] pBuffer,                     // data buffer, should be fixed
            int NumberOfBytesToRead,            // number of bytes to read
            IntPtr pNumberOfBytesRead,          // number of bytes read, provide NULL here
            ref NativeOverlapped lpOverlapped   // should be fixed, if not null
        );

        // SetFilePointerEx lives in kernel32.dll
        [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
        public static extern bool SetFilePointerEx(
            SafeFileHandle hFile,
            long liDistanceToMove,
            out long lpNewFilePointer,
            SeekOrigin dwMoveMethod);
    }
We'll use these nasty win32 beasts thusly:
    // To the metal, baby!
    using (var fileHandle = NativeMethods.CreateFile(
        // Magic "give me the device" syntax
        @"\\.\c:",
        // MUST explicitly provide both of these, not ReadWrite
        FileAccess.Read | FileAccess.Write,
        // MUST explicitly provide both of these, not ReadWrite
        FileShare.Write | FileShare.Read,
        IntPtr.Zero,
        FileMode.Open,
        FileAttributes.Normal,
        IntPtr.Zero))
    {
        if (fileHandle.IsInvalid)
        {
            // Doh!
            throw new Win32Exception();
        }
        else
        {
            // Boot sector ~ 512 bytes long
            byte[] buffer = new byte[512];
            NativeOverlapped overlapped = new NativeOverlapped();
            NativeMethods.ReadFile(fileHandle, buffer, buffer.Length, IntPtr.Zero, ref overlapped);

            // Pin it so we can transmogrify it into a FAT structure
            var handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
            try
            {
                // note, I've got an NTFS drive, change yours to suit
                var bootSector = (BootSector_NTFS)Marshal.PtrToStructure(
                    handle.AddrOfPinnedObject(),
                    typeof(BootSector_NTFS));
Whoa, BootSector_NTFS - what is BootSector_NTFS? It's a byte-mapped struct that comes as close as I can figure to what the NTFS structure looks like (FAT32 included as well):
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi, Pack = 0)]
    public struct JumpBoot
    {
        [MarshalAs(UnmanagedType.ByValArray, ArraySubType = UnmanagedType.U1, SizeConst = 3)]
        public byte[] BS_jmpBoot;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
        public string BS_OEMName;
    }

    [StructLayout(LayoutKind.Explicit, CharSet = CharSet.Ansi, Pack = 0, Size = 90)]
    public struct BootSector_NTFS
    {
        [FieldOffset(0)] public JumpBoot JumpBoot;
        [FieldOffset(0xb)] public short BytesPerSector;
        [FieldOffset(0xd)] public byte SectorsPerCluster;
        [FieldOffset(0xe)] public short ReservedSectorCount;
        [FieldOffset(0x10)] [MarshalAs(UnmanagedType.ByValArray, SizeConst = 5)] public byte[] Reserved0_MUSTBEZEROs;
        [FieldOffset(0x15)] public byte BPB_Media;
        [FieldOffset(0x16)] public short Reserved1_MUSTBEZERO;
        [FieldOffset(0x18)] public short SectorsPerTrack;
        [FieldOffset(0x1A)] public short HeadCount;
        [FieldOffset(0x1c)] public int HiddenSectorCount;
        [FieldOffset(0x20)] public int LargeSectors;
        [FieldOffset(0x24)] public int Reserved6;
        [FieldOffset(0x28)] public long TotalSectors;
        [FieldOffset(0x30)] public long MftClusterNumber;
        [FieldOffset(0x38)] public long MftMirrorClusterNumber;
        [FieldOffset(0x40)] public byte ClustersPerMftRecord;
        [FieldOffset(0x41)] public byte Reserved7;
        [FieldOffset(0x42)] public short Reserved8;
        [FieldOffset(0x44)] public byte ClustersPerIndexBuffer;
        [FieldOffset(0x45)] public byte Reserved9;
        [FieldOffset(0x46)] public short ReservedA;
        [FieldOffset(0x48)] [MarshalAs(UnmanagedType.ByValArray, SizeConst = 8)] public byte[] SerialNumber;
        [FieldOffset(0x50)] public int Checksum;
        [FieldOffset(0x54)] [MarshalAs(UnmanagedType.ByValArray, SizeConst = 0x1AA)] public byte[] BootupCode;
        [FieldOffset(0x1FE)] public ushort EndOfSectorMarker;

        public long GetMftAbsoluteIndex(int recordIndex = 0)
        {
            return (BytesPerSector * SectorsPerCluster * MftClusterNumber) + (GetMftEntrySize() * recordIndex);
        }

        public long GetMftEntrySize()
        {
            // Caveat: on disk this byte is signed; a negative value n means 2^(-n) bytes per record
            return (BytesPerSector * SectorsPerCluster * ClustersPerMftRecord);
        }
    }

    // Note: dont have fat32, so can't verify all these...they *should* work, tho
    // refs:
    //    http://www.pjrc.com/tech/8051/ide/fat32.html
    //    http://msdn.microsoft.com/en-US/windows/hardware/gg463084
    [StructLayout(LayoutKind.Explicit, CharSet = CharSet.Auto, Pack = 0, Size = 90)]
    public struct BootSector_FAT32
    {
        [FieldOffset(0)] public JumpBoot JumpBoot;
        [FieldOffset(11)] public short BPB_BytsPerSec;
        [FieldOffset(13)] public byte BPB_SecPerClus;
        [FieldOffset(14)] public short BPB_RsvdSecCnt;
        [FieldOffset(16)] public byte BPB_NumFATs;
        [FieldOffset(17)] public short BPB_RootEntCnt;
        [FieldOffset(19)] public short BPB_TotSec16;
        [FieldOffset(21)] public byte BPB_Media;
        [FieldOffset(22)] public short BPB_FATSz16;
        [FieldOffset(24)] public short BPB_SecPerTrk;
        [FieldOffset(26)] public short BPB_NumHeads;
        [FieldOffset(28)] public int BPB_HiddSec;
        [FieldOffset(32)] public int BPB_TotSec32;
        [FieldOffset(36)] public FAT32 FAT;
    }

    [StructLayout(LayoutKind.Sequential)]
    public struct FAT32
    {
        public int BPB_FATSz32;
        public short BPB_ExtFlags;
        public short BPB_FSVer;
        public int BPB_RootClus;
        public short BPB_FSInfo;
        public short BPB_BkBootSec;
        [MarshalAs(UnmanagedType.ByValArray, SizeConst = 12)]
        public byte[] BPB_Reserved;
        public byte BS_DrvNum;
        public byte BS_Reserved1;
        public byte BS_BootSig;
        public int BS_VolID;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 11)]
        public string BS_VolLab;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
        public string BS_FilSysType;
    }
So now we can map a whole mess'o'bytes back to this structure:
            // Pin it so we can transmogrify it into a FAT structure
            var handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
            try
            {
                // note, I've got an NTFS drive, change yours to suit
                var bootSector = (BootSector_NTFS)Marshal.PtrToStructure(
                    handle.AddrOfPinnedObject(),
                    typeof(BootSector_NTFS));
                Console.WriteLine(
                    "I think that the Master File Table is at absolute position:{0}, sector:{1}",
                    bootSector.GetMftAbsoluteIndex(),
                    bootSector.GetMftAbsoluteIndex() / bootSector.BytesPerSector);
Which at this point outputs:
I think that the Master File Table is at absolute position:3221225472, sector:6291456
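The arithmetic behind that output can be checked on its own. This is a minimal sketch, assuming 512-byte sectors, 8 sectors per cluster and an MFT starting cluster of 786432; those values are consistent with the numbers printed above but are assumptions about this particular volume. `MftMath` and its members are hypothetical names. The sketch also decodes the record-size byte as a signed value, because NTFS stores a negative number n there to mean 2^(-n) bytes per record:

```csharp
using System;

public static class MftMath
{
    // Absolute byte offset of the first MFT record on the volume.
    public static long MftByteOffset(int bytesPerSector, int sectorsPerCluster, long mftCluster)
    {
        return (long)bytesPerSector * sectorsPerCluster * mftCluster;
    }

    // Size of one MFT record in bytes. The raw boot-sector byte is signed:
    // a positive value counts clusters, a negative value n means 2^(-n) bytes.
    public static long MftRecordSize(int bytesPerSector, int sectorsPerCluster, byte rawClustersPerRecord)
    {
        sbyte signed = unchecked((sbyte)rawClustersPerRecord);
        if (signed < 0)
        {
            return 1L << -signed;
        }
        return (long)bytesPerSector * sectorsPerCluster * signed;
    }
}

public static class MftMathDemo
{
    public static void Main()
    {
        long offset = MftMath.MftByteOffset(512, 8, 786432);
        Console.WriteLine("MFT byte offset: {0}", offset);        // 3221225472
        Console.WriteLine("MFT sector:      {0}", offset / 512);  // 6291456
        // 0xF6 is -10 as a signed byte, i.e. 2^10 bytes per record
        Console.WriteLine("Record size:     {0}", MftMath.MftRecordSize(512, 8, 0xF6));  // 1024
    }
}
```

The signed decoding matters on volumes whose clusters are larger than one MFT record: reading the same byte as an unsigned cluster count would give a record size of hundreds of clusters instead of 1024 bytes.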
Let's confirm that with a quick run of the OEM Support Tools' nfi.exe:
    C:\tools\OEMTools\nfi>nfi c:
    NTFS File Sector Information Utility.
    Copyright (C) Microsoft Corporation 1999. All rights reserved.

    File 0
    Master File Table ($Mft)
        $STANDARD_INFORMATION (resident)
        $FILE_NAME (resident)
        $DATA (nonresident)
            logical sectors 6291456-6487039 (0x600000-0x62fbff)
            logical sectors 366267960-369153591 (0x15d4ce38-0x1600d637)
        $BITMAP (nonresident)
            logical sectors 6291448-6291455 (0x5ffff8-0x5fffff)
            logical sectors 7273984-7274367 (0x6efe00-0x6eff7f)
Cool, looks like we're on the right track... onward!
                // If you've got LinqPad, uncomment this to look at boot sector
                // bootSector.Dump();

                Console.WriteLine("Jumping to Master File Table...");
                long lpNewFilePointer;
                if (!NativeMethods.SetFilePointerEx(
                    fileHandle,
                    bootSector.GetMftAbsoluteIndex(),
                    out lpNewFilePointer,
                    SeekOrigin.Begin))
                {
                    throw new Win32Exception();
                }
                Console.WriteLine("Position now: {0}", lpNewFilePointer);

                // Read in one MFT entry
                byte[] mft_buffer = new byte[bootSector.GetMftEntrySize()];
                Console.WriteLine("Reading $MFT entry...calculated size: 0x{0}",
                    bootSector.GetMftEntrySize().ToString("X"));

                var seekIndex = bootSector.GetMftAbsoluteIndex();
                overlapped.OffsetHigh = (int)(seekIndex >> 32);
                // unchecked: the low 32 bits may not fit in a signed int
                overlapped.OffsetLow = unchecked((int)seekIndex);
                NativeMethods.ReadFile(
                    fileHandle,
                    mft_buffer,
                    mft_buffer.Length,
                    IntPtr.Zero,
                    ref overlapped);

                // Pin it for transmogrification
                var mft_handle = GCHandle.Alloc(mft_buffer, GCHandleType.Pinned);
                try
                {
                    var mftRecords = (MFTSystemRecords)Marshal.PtrToStructure(
                        mft_handle.AddrOfPinnedObject(),
                        typeof(MFTSystemRecords));
                    mftRecords.Dump();
                }
                finally
                {
                    // make sure we clean up
                    mft_handle.Free();
                }
            }
            finally
            {
                // make sure we clean up
                handle.Free();
            }
Argh, more native structures to discuss - so the MFT is arranged such that the first 16 or so entries are "fixed":
    [StructLayout(LayoutKind.Sequential)]
    public struct MFTSystemRecords
    {
        public MFTRecord Mft;
        public MFTRecord MftMirror;
        public MFTRecord LogFile;
        public MFTRecord Volume;
        public MFTRecord AttributeDefs;
        public MFTRecord RootFile;
        public MFTRecord ClusterBitmap;
        public MFTRecord BootSector;
        public MFTRecord BadClusterFile;
        public MFTRecord SecurityFile;
        public MFTRecord UpcaseTable;
        public MFTRecord ExtensionFile;
        [MarshalAs(UnmanagedType.ByValArray, SizeConst = 16)]
        public MFTRecord[] MftReserved;
        public MFTRecord MftFileExt;
    }
Where MFTRecord is:
    [StructLayout(LayoutKind.Sequential, Size = 1024)]
    public struct MFTRecord
    {
        const int BASE_RECORD_SIZE = 48;

        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 4)]
        public string Type;
        public short UsaOffset;
        public short UsaCount;
        // $LogFile sequence number for this record. Changed every time the record is modified.
        public long Lsn;
        // # of times this record has been reused
        public short SequenceNumber;
        // Number of hard links, ie the number of directory entries referencing this record.
        public short LinkCount;
        // Byte offset to the first attribute in this mft record from the start of the mft record.
        public short AttributeOffset;
        public short MftRecordFlags;
        public int BytesInUse;
        public int BytesAllocated;
        public long BaseFileRecord;
        public short NextAttributeNumber;
        public short Reserved;
        public int MftRecordNumber;
        [MarshalAs(UnmanagedType.ByValArray, SizeConst = 976)]
        public byte[] Data;

        public byte[] SetData
        {
            get
            {
                return this.Data
                    .Skip(AttributeOffset - BASE_RECORD_SIZE)
                    .Take(BytesInUse - BASE_RECORD_SIZE)
                    .ToArray();
            }
        }

        public MftAttribute[] Attributes
        {
            get
            {
                var idx = 0;
                var ret = new List<MftAttribute>();
                while (idx < SetData.Length)
                {
                    var attr = MftAttribute.FromBytes(SetData.Skip(idx).ToArray());
                    ret.Add(attr);
                    idx += attr.Attribute.Length;
                    // A special "END" attribute denotes the end of the list
                    if (attr.Attribute.AttributeType == MftAttributeType.AT_END) break;
                }
                return ret.ToArray();
            }
        }
    }
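And... here's where I peter out for now, mainly because I want to eat dinner and such. I will come back to this, however!
References (partially for my own memory, partially to assist other investigators)
Full code dump a'follows:
All the native mappings I glossed over above (due to post size limitations, not a full rehash):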
    public enum MftRecordFlags : ushort
    {
        MFT_RECORD_IN_USE = 0x0001,
        MFT_RECORD_IS_DIRECTORY = 0x0002,
        MFT_RECORD_IN_EXTEND = 0x0004,
        MFT_RECORD_IS_VIEW_INDEX = 0x0008,
        MFT_REC_SPACE_FILLER = 0xffff
    }

    public enum MftAttributeType : uint
    {
        AT_UNUSED = 0,
        AT_STANDARD_INFORMATION = 0x10,
        AT_ATTRIBUTE_LIST = 0x20,
        AT_FILENAME = 0x30,
        AT_OBJECT_ID = 0x40,
        AT_SECURITY_DESCRIPTOR = 0x50,
        AT_VOLUME_NAME = 0x60,
        AT_VOLUME_INFORMATION = 0x70,
        AT_DATA = 0x80,
        AT_INDEX_ROOT = 0x90,
        AT_INDEX_ALLOCATION = 0xa0,
        AT_BITMAP = 0xb0,
        AT_REPARSE_POINT = 0xc0,
        AT_EA_INFORMATION = 0xd0,
        AT_EA = 0xe0,
        AT_PROPERTY_SET = 0xf0,
        AT_LOGGED_UTILITY_STREAM = 0x100,
        AT_FIRST_USER_DEFINED_ATTRIBUTE = 0x1000,
        AT_END = 0xffffffff
    }

    public enum MftAttributeDefFlags : byte
    {
        // Attribute can be indexed.
        ATTR_DEF_INDEXABLE = 0x02,
        // Attribute type can be present multiple times in the mft records of an inode.
        ATTR_DEF_MULTIPLE = 0x04,
        // Attribute value must contain at least one non-zero byte.
        ATTR_DEF_NOT_ZERO = 0x08,
        // Attribute must be indexed and the attribute value must be unique for the
        // attribute type in all of the mft records of an inode.
        ATTR_DEF_INDEXED_UNIQUE = 0x10,
        // Attribute must be named and the name must be unique for the attribute type
        // in all of the mft records of an inode.
        ATTR_DEF_NAMED_UNIQUE = 0x20,
        // Attribute must be resident.
        ATTR_DEF_RESIDENT = 0x40,
        // Always log modifications to this attribute, regardless of whether it is
        // resident or non-resident. Without this, only log modifications if the
        // attribute is resident.
        ATTR_DEF_ALWAYS_LOG = 0x80,
    }

    [StructLayout(LayoutKind.Explicit)]
    public struct MftInternalAttribute
    {
        [FieldOffset(0)] public MftAttributeType AttributeType;
        [FieldOffset(4)] public int Length;
        // One byte on disk; a 4-byte BOOL here would overlap NameLength and NameOffset
        [FieldOffset(8)] [MarshalAs(UnmanagedType.U1)] public bool NonResident;
        [FieldOffset(9)] public byte NameLength;
        [FieldOffset(10)] public short NameOffset;
        // Two bytes on disk; an int here would overlap Instance at offset 14
        [FieldOffset(12)] public short AttributeFlags;
        [FieldOffset(14)] public short Instance;
        [FieldOffset(16)] public ResidentAttribute ResidentAttribute;
        [FieldOffset(16)] public NonResidentAttribute NonResidentAttribute;
    }

    [StructLayout(LayoutKind.Sequential)]
    public struct ResidentAttribute
    {
        public int ValueLength;
        public short ValueOffset;
        public byte ResidentAttributeFlags;
        public byte Reserved;

        public override string ToString()
        {
            return string.Format("{0}:{1}:{2}:{3}",
                ValueLength, ValueOffset, ResidentAttributeFlags, Reserved);
        }
    }

    [StructLayout(LayoutKind.Sequential)]
    public struct NonResidentAttribute
    {
        public long LowestVcn;
        public long HighestVcn;
        public short MappingPairsOffset;
        public byte CompressionUnit;
        [MarshalAs(UnmanagedType.ByValArray, SizeConst = 5)]
        public byte[] Reserved;
        public long AllocatedSize;
        public long DataSize;
        public long InitializedSize;
        public long CompressedSize;

        public override string ToString()
        {
            return string.Format("{0}:{1}:{2}:{3}:{4}:{5}:{6}:{7}",
                LowestVcn, HighestVcn, MappingPairsOffset, CompressionUnit,
                AllocatedSize, DataSize, InitializedSize, CompressedSize);
        }
    }

    public struct MftAttribute
    {
        public MftInternalAttribute Attribute;
        [field: NonSerialized] public string Name;
        [field: NonSerialized] public byte[] Data;
        [field: NonSerialized] public object Payload;

        public static MftAttribute FromBytes(byte[] buffer)
        {
            var hnd = GCHandle.Alloc(buffer, GCHandleType.Pinned);
            try
            {
                var attr = (MftInternalAttribute)Marshal.PtrToStructure(
                    hnd.AddrOfPinnedObject(), typeof(MftInternalAttribute));
                var ret = new MftAttribute() { Attribute = attr };
                ret.Data = buffer.Skip(Marshal.SizeOf(attr)).Take(attr.Length).ToArray();
                if (ret.Attribute.AttributeType == MftAttributeType.AT_STANDARD_INFORMATION)
                {
                    var payloadHnd = GCHandle.Alloc(ret.Data, GCHandleType.Pinned);
                    try
                    {
                        var payload = (MftStandardInformation)Marshal.PtrToStructure(
                            payloadHnd.AddrOfPinnedObject(), typeof(MftStandardInformation));
                        ret.Payload = payload;
                    }
                    finally
                    {
                        payloadHnd.Free();
                    }
                }
                return ret;
            }
            finally
            {
                hnd.Free();
            }
        }
    }

    [StructLayout(LayoutKind.Sequential)]
    public struct MftStandardInformation
    {
        public ulong CreationTime;
        public ulong LastDataChangeTime;
        public ulong LastMftChangeTime;
        public ulong LastAccessTime;
        public int FileAttributes;
        public int MaximumVersions;
        public int VersionNumber;
        public int ClassId;
        public int OwnerId;
        public int SecurityId;
        public long QuotaChanged;
        public long Usn;
    }

    // Note: dont have fat32, so can't verify all these...they *should* work, tho
    // refs:
    //    http://www.pjrc.com/tech/8051/ide/fat32.html
    //    http://msdn.microsoft.com/en-US/windows/hardware/gg463084
    [StructLayout(LayoutKind.Explicit, CharSet = CharSet.Auto, Pack = 0, Size = 90)]
    public struct BootSector_FAT32
    {
        [FieldOffset(0)] public JumpBoot JumpBoot;
        [FieldOffset(11)] public short BPB_BytsPerSec;
        [FieldOffset(13)] public byte BPB_SecPerClus;
        [FieldOffset(14)] public short BPB_RsvdSecCnt;
        [FieldOffset(16)] public byte BPB_NumFATs;
        [FieldOffset(17)] public short BPB_RootEntCnt;
        [FieldOffset(19)] public short BPB_TotSec16;
        [FieldOffset(21)] public byte BPB_Media;
        [FieldOffset(22)] public short BPB_FATSz16;
        [FieldOffset(24)] public short BPB_SecPerTrk;
        [FieldOffset(26)] public short BPB_NumHeads;
        [FieldOffset(28)] public int BPB_HiddSec;
        [FieldOffset(32)] public int BPB_TotSec32;
        [FieldOffset(36)] public FAT32 FAT;
    }

    [StructLayout(LayoutKind.Sequential)]
    public struct FAT32
    {
        public int BPB_FATSz32;
        public short BPB_ExtFlags;
        public short BPB_FSVer;
        public int BPB_RootClus;
        public short BPB_FSInfo;
        public short BPB_BkBootSec;
        [MarshalAs(UnmanagedType.ByValArray, SizeConst = 12)]
        public byte[] BPB_Reserved;
        public byte BS_DrvNum;
        public byte BS_Reserved1;
        public byte BS_BootSig;
        public int BS_VolID;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 11)]
        public string BS_VolLab;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
        public string BS_FilSysType;
    }
Test rig:
    using System;
    using System.ComponentModel;
    using System.Diagnostics;
    using System.IO;
    using System.Runtime.InteropServices;
    using System.Threading;
    using System.Windows;
    using System.Windows.Controls;
    using Newtonsoft.Json;

    class Program
    {
        static void Main(string[] args)
        {
            // To the metal, baby!
            using (var fileHandle = NativeMethods.CreateFile(
                // Magic "give me the device" syntax
                @"\\.\c:",
                // MUST explicitly provide both of these, not ReadWrite
                FileAccess.Read | FileAccess.Write,
                // MUST explicitly provide both of these, not ReadWrite
                FileShare.Write | FileShare.Read,
                IntPtr.Zero,
                FileMode.Open,
                FileAttributes.Normal,
                IntPtr.Zero))
            {
                if (fileHandle.IsInvalid)
                {
                    // Doh!
                    throw new Win32Exception();
                }
                else
                {
                    // Boot sector ~ 512 bytes long
                    byte[] buffer = new byte[512];
                    NativeOverlapped overlapped = new NativeOverlapped();
                    NativeMethods.ReadFile(fileHandle, buffer, buffer.Length, IntPtr.Zero, ref overlapped);

                    // Pin it so we can transmogrify it into a FAT structure
                    var handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
                    try
                    {
                        // note, I've got an NTFS drive, change yours to suit
                        var bootSector = (BootSector_NTFS)Marshal.PtrToStructure(
                            handle.AddrOfPinnedObject(), typeof(BootSector_NTFS));
                        Console.WriteLine(
                            "I think that the Master File Table is at absolute position:{0}, sector:{1}",
                            bootSector.GetMftAbsoluteIndex(),
                            bootSector.GetMftAbsoluteIndex() / bootSector.BytesPerSector);
                        Console.WriteLine("MFT record size:{0}",
                            bootSector.ClustersPerMftRecord * bootSector.SectorsPerCluster * bootSector.BytesPerSector);

                        // If you've got LinqPad, uncomment this to look at boot sector
                        // bootSector.DumpToHtmlString();
                        Pause();

                        Console.WriteLine("Jumping to Master File Table...");
                        long lpNewFilePointer;
                        if (!NativeMethods.SetFilePointerEx(fileHandle, bootSector.GetMftAbsoluteIndex(),
                            out lpNewFilePointer, SeekOrigin.Begin))
                        {
                            throw new Win32Exception();
                        }
                        Console.WriteLine("Position now: {0}", lpNewFilePointer);

                        // Read in one MFT entry
                        byte[] mft_buffer = new byte[bootSector.GetMftEntrySize()];
                        Console.WriteLine("Reading $MFT entry...calculated size: 0x{0}",
                            bootSector.GetMftEntrySize().ToString("X"));

                        var seekIndex = bootSector.GetMftAbsoluteIndex();
                        overlapped.OffsetHigh = (int)(seekIndex >> 32);
                        // unchecked: the low 32 bits may not fit in a signed int
                        overlapped.OffsetLow = unchecked((int)seekIndex);
                        NativeMethods.ReadFile(fileHandle, mft_buffer, mft_buffer.Length, IntPtr.Zero, ref overlapped);

                        // Pin it for transmogrification
                        var mft_handle = GCHandle.Alloc(mft_buffer, GCHandleType.Pinned);
                        try
                        {
                            var mftRecords = (MFTSystemRecords)Marshal.PtrToStructure(
                                mft_handle.AddrOfPinnedObject(), typeof(MFTSystemRecords));
                            mftRecords.DumpToHtmlString();
                        }
                        finally
                        {
                            // make sure we clean up
                            mft_handle.Free();
                        }
                    }
                    finally
                    {
                        // make sure we clean up
                        handle.Free();
                    }
                }
            }
            Pause();
        }

        private static void Pause()
        {
            Console.WriteLine("Press enter to continue...");
            Console.ReadLine();
        }
    }

    public static class Dumper
    {
        public static string DumpToHtmlString<T>(this T objectToSerialize)
        {
            string strHTML = "";
            try
            {
                var writer = LINQPad.Util.CreateXhtmlWriter(true);
                writer.Write(objectToSerialize);
                strHTML = writer.ToString();
            }
            catch (Exception exc)
            {
                Debug.Assert(false, "Investigate why ?" + exc);
            }

            // Show the HTML in a WPF window on its own STA thread
            var shower = new Thread(
                () =>
                {
                    var dumpWin = new Window();
                    var browser = new WebBrowser();
                    dumpWin.Content = browser;
                    browser.NavigateToString(strHTML);
                    dumpWin.ShowDialog();
                });
            shower.SetApartmentState(ApartmentState.STA);
            shower.Start();
            return strHTML;
        }

        public static string Dump(this object value)
        {
            return JsonConvert.SerializeObject(value, Formatting.Indented);
        }
    }
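Robert, I don't think that what you want to achieve is really possible without actively manipulating the data structures of a file system which, from the sounds of it, is mounted. I don't think I have to tell you how dangerous and unwise this sort of exercise is.
But if you need to do it, I guess I can give you a "sketch on the back of a napkin" to get you started:
You could leverage the "sparse file" support of NTFS to simply add "gaps" by tweaking the LCN/VCN mappings. Once you do, just open the file, seek to the new location and write your data. NTFS will transparently allocate the space and write the data into the middle of the file, where you created a hole.
For more, look at this page about defragmentation support in NTFS for hints on how you can manipulate things a bit and insert clusters in the middle of the file. At least by using the sanctioned API for this sort of thing, you are unlikely to corrupt the file system beyond repair, although you can still horribly hose your file, I guess.
Get the retrieval pointers for the file that you want, split them where you need to, add as much extra space as you need, and move the file. There's an interesting chapter on this sort of thing in the Russinovich/Ionescu "Windows Internals" book ( http://www.amazon.com/Windows%C2%AE-Internals-Including-Windows-Developer/dp/0735625301 )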
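The first step, reading the retrieval pointers, comes down to interpreting the buffer that FSCTL_GET_RETRIEVAL_POINTERS fills in. Here is a rough sketch of just the parsing half, assuming the documented RETRIEVAL_POINTERS_BUFFER layout (a 32-bit extent count at offset 0, the starting VCN at offset 8, then NextVcn/Lcn pairs of 8 bytes each from offset 16) on a little-endian machine. `Extent` and `RetrievalPointers.Parse` are made-up names for illustration, and the DeviceIoControl call that would supply the bytes is left out, so the demo uses a hand-built buffer:

```csharp
using System;
using System.Collections.Generic;

public struct Extent
{
    public long FirstVcn;   // first virtual cluster covered by this run
    public long NextVcn;    // VCN at which the following run starts
    public long Lcn;        // logical cluster on the volume, or -1 for a hole
}

public static class RetrievalPointers
{
    // Turns a raw RETRIEVAL_POINTERS_BUFFER into a list of extents.
    public static List<Extent> Parse(byte[] buffer)
    {
        int count = BitConverter.ToInt32(buffer, 0);
        long vcn = BitConverter.ToInt64(buffer, 8);
        var extents = new List<Extent>(count);
        for (int i = 0; i < count; i++)
        {
            int at = 16 + i * 16;
            long nextVcn = BitConverter.ToInt64(buffer, at);
            long lcn = BitConverter.ToInt64(buffer, at + 8);
            extents.Add(new Extent { FirstVcn = vcn, NextVcn = nextVcn, Lcn = lcn });
            vcn = nextVcn;   // each run starts where the previous one ended
        }
        return extents;
    }
}

public static class RetrievalPointersDemo
{
    public static void Main()
    {
        // Hand-built sample: 2 extents, starting at VCN 0.
        var buffer = new byte[16 + 2 * 16];
        BitConverter.GetBytes(2).CopyTo(buffer, 0);
        BitConverter.GetBytes(0L).CopyTo(buffer, 8);
        BitConverter.GetBytes(100L).CopyTo(buffer, 16);   // NextVcn of run 0
        BitConverter.GetBytes(5000L).CopyTo(buffer, 24);  // Lcn of run 0
        BitConverter.GetBytes(150L).CopyTo(buffer, 32);   // NextVcn of run 1
        BitConverter.GetBytes(-1L).CopyTo(buffer, 40);    // run 1 is a hole
        foreach (var e in RetrievalPointers.Parse(buffer))
        {
            Console.WriteLine("VCN {0}..{1} -> LCN {2}", e.FirstVcn, e.NextVcn - 1, e.Lcn);
        }
    }
}
```

An extent list in this form is what you would split at the insertion point before asking the file system to relocate clusters.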
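Abstract question, abstract answer:
It is certainly possible to do this in FAT, and probably in most other file systems; you would essentially be fragmenting the file, rather than following the more common process of defragmenting.
FAT is organized around cluster pointers which form a chain of the cluster numbers where the data is stored: the first link's index is stored with the file record, the second one is stored in the allocation table at index [the first link's number], and so on. You can insert another link anywhere in the chain, as long as the data you're inserting ends at a cluster boundary.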
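The chain manipulation described above can be sketched in memory. This is only an illustration of the linked-list idea, not code that touches a real volume: the table is a plain int array, and `EndOfChain` and `Free` are stand-in values rather than the real FAT32 markers. All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class FatChain
{
    public const int EndOfChain = -1;   // stand-in for the real end-of-chain marker
    public const int Free = 0;          // stand-in for a free table entry

    // Walks the chain starting at firstCluster and returns the cluster numbers in file order.
    public static List<int> Walk(int[] fat, int firstCluster)
    {
        var chain = new List<int>();
        for (int c = firstCluster; c != EndOfChain; c = fat[c])
        {
            chain.Add(c);
        }
        return chain;
    }

    // Links a free cluster into the chain right after 'after' and returns its number.
    public static int InsertAfter(int[] fat, int after)
    {
        int fresh = Array.IndexOf(fat, Free, 2);   // skip entries 0 and 1, which FAT reserves
        if (fresh < 0) throw new InvalidOperationException("no free cluster");
        fat[fresh] = fat[after];   // new cluster points at the old successor
        fat[after] = fresh;        // predecessor now points at the new cluster
        return fresh;
    }
}

public static class FatChainDemo
{
    public static void Main()
    {
        var fat = new int[8];
        fat[2] = 3; fat[3] = 4; fat[4] = FatChain.EndOfChain;    // file occupies 2 -> 3 -> 4
        FatChain.InsertAfter(fat, 3);
        Console.WriteLine(string.Join(" -> ", FatChain.Walk(fat, 2)));   // 2 -> 3 -> 5 -> 4
    }
}
```

No existing data moves: only two table entries change, which is why the inserted data has to fill whole clusters.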
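Chances are you'll have a much easier time doing this in C by finding an open source library. Although it's probably possible to do it in C# with PInvoke, you won't find any good sample code floating around to get you started.
I suspect you don't have any control over the file format (video files?); if you do, it would be much easier to design your data storage to avoid the problem in the first place.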
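No. What you are asking is not possible in Windows.
This is because in Windows, a file is a logically contiguous collection of bytes, and it is not possible to insert bytes into the middle of the file without overwriting.
To understand why, let's carry out a thought experiment about what it would mean if it were possible.
Firstly, memory-mapped files would suddenly become much more complicated. If we have mapped a file at a particular address and then put some extra bytes in the middle of it, what would that mean for the memory mapping? Should the memory mapping now suddenly move? And if it does, what happens to the program that doesn't expect it?
Secondly, let's consider what happens to the file pointer if two handles are open to the same file, and one inserts extra bytes in the middle of that file. Let's suppose that Process A has the file open for reading, and Process B has it open for reading and writing.
Process A wants to save its location whilst doing a few reads, so it writes some code a bit like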
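    #include <windows.h>

    DWORD DoAndThenRewind(HANDLE hFile, FARPROC fp)
    {
        DWORD result;
        LARGE_INTEGER zero = { 0 };
        LARGE_INTEGER li;

        /* Moving by zero from the current position reports the current offset in li */
        SetFilePointerEx(hFile, zero, &li, FILE_CURRENT);
        result = (DWORD)fp();
        /* Rewind to the saved offset */
        SetFilePointerEx(hFile, li, &li, FILE_BEGIN);
        return result;
    }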
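Now what happens to this function if Process B wants to insert some extra bytes into the file? Well, if we add the bytes after where Process A currently is, everything is fine: the file pointer (which is the linear address from the start of the file) remains the same before and after, and all is well.
But if we add the extra bytes before where Process A is, then suddenly our captured file pointers are all misaligned, and bad things start to happen.
Or, to put it another way, adding bytes into the middle of the file means that we suddenly need to invent cleverer ways of describing where we are in the file for rewinding purposes, since files are no longer a logically contiguous selection of bytes.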
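The effect on a saved offset can be reproduced with an in-memory list of bytes standing in for the file. This is a toy model under that assumption, with hypothetical names, and involves no real file handles:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class StaleOffset
{
    // Returns the byte found at savedOffset after 'inserted' is spliced in at insertAt.
    public static char ReadAfterInsert(string content, int savedOffset, int insertAt, string inserted)
    {
        var file = new List<byte>(Encoding.ASCII.GetBytes(content));
        file.InsertRange(insertAt, Encoding.ASCII.GetBytes(inserted));
        return (char)file[savedOffset];
    }
}

public static class StaleOffsetDemo
{
    public static void Main()
    {
        // "Process A" remembered offset 4, which held 'E' in "ABCDEFGH".
        // Insert after the saved position: offset 4 still holds 'E'.
        Console.WriteLine(StaleOffset.ReadAfterInsert("ABCDEFGH", 4, 6, "xy"));   // E
        // Insert before the saved position: offset 4 now holds 'C'.
        Console.WriteLine(StaleOffset.ReadAfterInsert("ABCDEFGH", 4, 1, "xy"));   // C
    }
}
```

The saved number is unchanged in both cases; what changes is which byte it refers to, and the process holding it has no way to find out.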
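So far we've discussed why it's probably a bad idea for Windows to expose this kind of functionality, but that doesn't really answer the question "is it actually possible?". The answer here is still no. It is not possible.
Why? Because no such functionality is exposed to user-mode programs. As a user-mode program, you have one mechanism for getting a handle to a file (NtCreateFile/NtOpenFile), you can read and write to it via NtReadFile/NtWriteFile, you can seek in it and rename and delete it via NtSetInformationFile, and you can release the handle reference via NtClose.
Even from kernel mode, you don't have many more options. The file system API is abstracted away from you, and file systems treat files as logically contiguous collections of bytes, not as linked lists of byte ranges or anything else that would make it easy to expose a method for inserting non-overwriting bytes into the middle of a file.
That's not to say it's not possible per se. As others have mentioned, you could open the disk itself, pretend to be NTFS, and directly alter the disk clusters assigned to a particular FCB. But doing so is brave. NTFS is barely documented, is complex, is subject to change, and is difficult to modify even when it's not mounted by the OS, never mind when it is.
So the answer, I'm afraid, is no. It is not possible via the normal, safe Windows mechanisms to add extra bytes to the middle of a file as an insertion rather than as an overwrite operation.
Instead, consider looking at your problem to see whether it is appropriate to chunk your data up into smaller files plus an index file. That way you can modify the index file to insert extra chunks. By breaking your dependence on the data needing to reside in one file, you will find it easier to sidestep the file system's requirement that a file be a logically contiguous collection of bytes. You'll then be able to modify the index file to add extra chunks to your "pseudofile" without having to read the entire pseudofile into memory.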
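You don't need to (and probably can't) modify the file access table. You can achieve the same thing using a filter driver or a stackable FS. Let us consider a cluster size of 4K. I am just writing out the design, for reasons I explain at the end.
Creating a new file will put the layout map of the file in a header. The header will hold the number of entries and the list of entries. The size of the header will be the same as the size of a cluster. For simplicity, let the header be of fixed size with 4K entries. For example, for a file of 20KB the header may read: [DWORD:5] [DWORD:1] [DWORD:2] [DWORD:3] [DWORD:4] [DWORD:5]. This file has had no insertions so far.
Suppose someone inserts a cluster after cluster 3. You can add it to the end of the file, as cluster 6, and change the layout map to: [6] [1] [2] [3] [6] [4] [5]
Suppose someone needs to seek to cluster 4. You will need to consult the layout map, calculate the offset, and then seek to it. Logically it now sits after the header and four data clusters, but physically cluster 4 has not moved, so its data still starts at 16K in the underlying file.
Suppose someone reads or writes sequentially to the file. The reads and writes will have to be mapped in the same way.
Suppose the header has only one more entry left: we will need to extend it by having a pointer to a new cluster at the end of the file, using the same format as the other pointers above. To know that we have more than one cluster, all we need to do is look at the number of items and calculate the number of clusters required to store them.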
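The layout map above can be modelled in a few lines. This is a sketch of the bookkeeping only, under these assumptions: the header occupies physical cluster 0, data clusters are numbered from 1, new clusters are always appended at the physical end of the file, and the header never overflows. `LayoutMap` and its members are hypothetical names, and a real filter driver would persist the map in the header rather than keep it in a list.

```csharp
using System;
using System.Collections.Generic;

public class LayoutMap
{
    public const int ClusterSize = 4096;

    // entries[i] is the physical cluster (1-based, cluster 0 holds the header)
    // that stores logical cluster i of the file.
    private readonly List<int> entries = new List<int>();

    public LayoutMap(int clusterCount)
    {
        for (int i = 1; i <= clusterCount; i++) entries.Add(i);
    }

    public int Count { get { return entries.Count; } }

    // Inserts a new logical cluster after the first 'afterLogical' logical clusters.
    // The data itself is appended to the end of the file; only the map changes.
    public int InsertAfter(int afterLogical)
    {
        int physical = entries.Count + 1;
        entries.Insert(afterLogical, physical);
        return physical;
    }

    // Translates a logical byte offset into a physical byte offset in the backing file.
    public long ToPhysicalOffset(long logicalOffset)
    {
        int logicalCluster = (int)(logicalOffset / ClusterSize);
        return (long)entries[logicalCluster] * ClusterSize + logicalOffset % ClusterSize;
    }
}

public static class LayoutMapDemo
{
    public static void Main()
    {
        var map = new LayoutMap(5);    // 20KB file: clusters 1..5
        map.InsertAfter(3);            // map becomes 1, 2, 3, 6, 4, 5
        Console.WriteLine(map.ToPhysicalOffset(3 * 4096));   // 24576: the new cluster 6
        Console.WriteLine(map.ToPhysicalOffset(4 * 4096));   // 16384: old cluster 4, unmoved
    }
}
```

Every read, write and seek goes through `ToPhysicalOffset`, which is the per-request cost that makes IO speed the hard part of a production implementation.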
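You can implement all of the above using a filter driver on Windows or a stackable file system (LKM) on Linux. Implementing the basic level of functionality is at the level of a difficult grad school mini-project. Getting this to work as a commercial file system could be quite challenging, especially since you don't want to affect IO speeds.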
Note that the above filter will not be affected by any change in disk layout / defragmentation etc. You can also defragment your own file if you think it will be helpful.
Do you understand that it's nearly 99.99% impossible to insert non-aligned data in non-aligned places? (Maybe some hack based on compression can be used.) I think that you do.
The "easiest" solution is to create the sparse run records and then write over the sparse ranges.
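For context on what a "sparse run record" looks like: NTFS keeps a non-resident attribute's extents as a packed list of data runs, and a run with no offset field is a hole. The decoder below is a sketch based on the publicly circulated descriptions of that encoding (header byte with the length-field size in the low nibble and the offset-field size in the high nibble, little-endian fields, offsets stored as signed deltas from the previous run, a zero byte as terminator). That format is an assumption here rather than an official specification, and `DataRun` and `RunList.Decode` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public struct DataRun
{
    public long Length;     // run length in clusters
    public long Lcn;        // absolute starting cluster, meaningful only when !IsSparse
    public bool IsSparse;   // true when the run has no clusters behind it
}

public static class RunList
{
    public static List<DataRun> Decode(byte[] bytes)
    {
        var runs = new List<DataRun>();
        long previousLcn = 0;
        int pos = 0;
        while (pos < bytes.Length && bytes[pos] != 0)
        {
            int lengthSize = bytes[pos] & 0x0F;   // low nibble: bytes in the length field
            int offsetSize = bytes[pos] >> 4;     // high nibble: bytes in the offset field
            pos++;

            long length = 0;
            for (int i = 0; i < lengthSize; i++)
            {
                length |= (long)bytes[pos + i] << (8 * i);
            }
            pos += lengthSize;

            var run = new DataRun { Length = length, IsSparse = offsetSize == 0 };
            if (!run.IsSparse)
            {
                long delta = 0;
                for (int i = 0; i < offsetSize; i++)
                {
                    delta |= (long)bytes[pos + i] << (8 * i);
                }
                // Sign-extend: the offset is a signed delta from the previous run's LCN.
                int shift = 64 - 8 * offsetSize;
                delta = (delta << shift) >> shift;
                previousLcn += delta;
                run.Lcn = previousLcn;
                pos += offsetSize;
            }
            runs.Add(run);
        }
        return runs;
    }
}

public static class RunListDemo
{
    public static void Main()
    {
        // 24 clusters at LCN 0x5634, a 16-cluster hole, then 8 clusters 16 below the first run.
        byte[] sample = { 0x21, 0x18, 0x34, 0x56, 0x01, 0x10, 0x11, 0x08, 0xF0, 0x00 };
        foreach (var run in RunList.Decode(sample))
        {
            Console.WriteLine(run.IsSparse
                ? string.Format("{0} clusters, sparse", run.Length)
                : string.Format("{0} clusters at LCN {1}", run.Length, run.Lcn));
        }
    }
}
```

In the sample, the middle run is the kind of record this answer refers to: it reserves 16 clusters of file offset space without any clusters behind it.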
It all really depends on what the original problem is, that is, what you're trying to achieve. Modification of a FAT / NTFS table is not the problem, it's a solution to your problem: potentially elegant and efficient, but more likely highly dangerous and inappropriate. You mentioned that you have no control over the users' systems where it will be used, so presumably for at least some of them the administrator would object to hacking into the file system internals.
Anyways, let's get back to the problem. Given the incomplete information, several use cases may be imagined, and the solution will be either easy or difficult depending on the use case.
If you know that after the edit the file won't be needed for some time, then saving the edit in half a second is easy — just close the window and let the application finish saving in the background, even if it takes half an hour. I know this sounds dumb, but this is a frequent use case — once you finish editing your file, you save it, close the program, and you don't need that file anymore for a long time.
Unless you do. Maybe the user decides to edit some more, or maybe another user comes along. In both cases your application can easily detect that the file is in the process of being saved to hard disk (for example, you may keep a hidden guard file around while the main file is being saved). In this case you would open the file as-is (partially saved), but present to the user a customized view of the file which makes it appear as if the file is in its final state. After all, you have all the information about which chunks of the file have to be moved where.
Unless the user needs to open the file immediately in another editor (this is not a very common case, especially for a very specialized file format, but then who knows). If so, do you have access to the source code of that other editor? Or can you talk to the developers of that other editor and persuade them to treat the incompletely saved file as if it were in its final state? It's not that hard: all it takes is to read the offset information from the guard file. I would imagine the developers of that other editor are equally frustrated with long save times and would gladly embrace your solution, as it would help their product.
What else could we have? Maybe the user wants to immediately copy or move the file somewhere else. Microsoft probably won't change Windows Explorer for your benefit. In that case you would either need to implement the UMDF driver, or plainly forbid the user to do so (for example rename the original file and hide it, leaving a blank placeholder in its place; when the user tries to copy the file at least he'll know something went wrong).
Another possibility, which doesn't fit in the above hierarchy 1-4 nicely, comes up if you know beforehand which files will be edited. In that case you can "pre-sparse" the file inserting random gaps uniformly along the volume of the file. This is due to the special nature of your file format that you mentioned: there could be gaps of no data, provided that the links correctly point to following next data chunks. If you know which files will be edited (not unreasonable assumption — how many 10Gb files lie around your hard drive?) you "inflate" the file before the user starts editing it (say, the night before), and then just move around these smaller chunks of data when you need to insert new data. This of course also relies on the assumption that you don't have to insert TOO much.
In any case, there's always more than one answer depending on what your users actually want. But my advice comes from a designer's perspective, not from programmer's.
Edited – another approach – how about switching to Mac for this task? They have superior editing capabilities, with automation capabilities!
Edited – the original specs suggested the file was being modified a lot, instead it is modified once. Suggest as others have pointed out to do the operation in the background: copy to new file, delete old file, rename new file to old file.
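The copy, delete and rename sequence could look roughly like the following. It is a sketch using only System.IO, assuming the `.tmp` sibling path is free to use and that nothing else has the file open; `FileInsert` and its members are hypothetical names.

```csharp
using System;
using System.IO;

public static class FileInsert
{
    // Writes a copy of 'path' with 'data' spliced in at 'offset', then swaps it into place.
    public static void InsertBytes(string path, long offset, byte[] data)
    {
        string temp = path + ".tmp";   // hypothetical temp name, assumed not to exist
        using (var source = File.OpenRead(path))
        using (var target = File.Create(temp))
        {
            CopyBytes(source, target, offset);    // everything before the insertion point
            target.Write(data, 0, data.Length);   // the new bytes
            source.CopyTo(target);                // the rest of the file
        }
        File.Delete(path);       // delete old file
        File.Move(temp, path);   // rename new file to old file
    }

    // Copies exactly 'count' bytes from source to target.
    private static void CopyBytes(Stream source, Stream target, long count)
    {
        var buffer = new byte[81920];
        while (count > 0)
        {
            int read = source.Read(buffer, 0, (int)Math.Min(buffer.Length, count));
            if (read == 0) throw new EndOfStreamException("offset is past the end of the file");
            target.Write(buffer, 0, read);
            count -= read;
        }
    }
}

public static class FileInsertDemo
{
    public static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "HelloWorld");
        FileInsert.InsertBytes(path, 5, new byte[] { (byte)',', (byte)' ' });
        Console.WriteLine(File.ReadAllText(path));   // Hello, World
        File.Delete(path);
    }
}
```

To keep the user from waiting, the call can be handed to a worker, for example `Task.Run(() => FileInsert.InsertBytes(path, offset, data))`. It still rewrites the whole file, so the half-hour cost moves to the background rather than going away.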
I would abandon this approach. A database is what you're looking for./YR