2010/07/02

Analyse Which Files Take the Most Space

I often need to either free up space on a drive or shrink a project or a set of notes for archiving. One way to do it is to browse the folders blindly and delete big files as I find them, but this is cumbersome. So I wrote a small Python script that lists the 150 biggest files under the folder in which it is run, subfolders included.




#python 2.6, 3.1

import os
# misc.search_files is not a standard-library module; it is a local helper that is not listed in this post
import misc.search_files

class FileInfo:
  
  __FullPathFilename = None
  __FileSize = None
  
  def __init__(self, FullPathFilename_, FileSize_):
    self.__FullPathFilename = FullPathFilename_
    self.__FileSize = FileSize_
  
  def __lt__(self, other):
    return (self.__FileSize < other.__FileSize)
  
  def ToRow_NameAndSize(self, SepareAt = 100):
    # Keep only the last SepareAt characters of the path, padded to a fixed width
    FullPathStr = self.GetFilename()
    if len(FullPathStr) > SepareAt:
      FullPathStr = FullPathStr[len(FullPathStr)-SepareAt:]
    FullPathStr = FullPathStr.ljust(SepareAt)
    # Integer division so that Python 3 also prints whole kilobytes
    return FullPathStr + " : " + str(self.GetFileSize()//1024) + 'k'

  def GetFilename(self):
    return self.__FullPathFilename
  
  def GetFileSize(self):
    return self.__FileSize

def GetBiggestFileList(SepareAt = 110, MaxNumFilesInReport = 150):
  lAllFilesListIncludingSubDirs = misc.search_files.getAllFilesRecursively(['*.*'], '.')
  
  TotalDiskSpace = 0
  AllFileInfo = []
  
  for file in lAllFilesListIncludingSubDirs:
    try:
      lFileSize = os.path.getsize(file)
      TotalDiskSpace += lFileSize
      AllFileInfo.append( FileInfo(file, lFileSize) )
    except Exception as inst:
      # Report which file failed and why (e.g. permission denied)
      print ("Error! " + file + " : " + str(inst))
  
  AllFileInfo.sort()
  
  if( len(AllFileInfo)>MaxNumFilesInReport ):
    subAllFileInfo = AllFileInfo[-MaxNumFilesInReport:]
  else:
    subAllFileInfo = AllFileInfo
  
  Report = ''
  Report+= 'Total size: ' + str(TotalDiskSpace//1024) + "k\n"
  for lFileInfo in subAllFileInfo:
    Report += str(lFileInfo.ToRow_NameAndSize(SepareAt)) + "\n"
  
  return Report

if __name__ == '__main__':
  print (GetBiggestFileList())
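
The script above relies on misc.search_files, a helper module that is not listed in this post. As a minimal sketch, assuming the helper simply walks the directory tree and returns file paths, a similar report can be produced with only the standard library. The names iter_file_sizes, biggest_files and format_report are hypothetical, and os.walk plus heapq.nlargest are swapped in here; this is not the original implementation.

```python
import heapq
import os


def iter_file_sizes(root):
    """Yield (size_in_bytes, path) for every file found under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                yield os.path.getsize(path), path
            except OSError as err:
                # Unreadable or vanished files are reported and skipped.
                print("Error! " + path + " : " + str(err))


def biggest_files(root=".", max_files=150):
    """Return (total_bytes, rows); rows hold the largest files, smallest first."""
    total = 0
    sizes = []
    for size, path in iter_file_sizes(root):
        total += size
        sizes.append((size, path))
    largest = heapq.nlargest(max_files, sizes)
    largest.reverse()  # biggest file last, as in the report above
    return total, largest


def format_report(total, rows, width=110):
    """Build the text report: total first, then one fixed-width row per file."""
    lines = ["Total size: " + str(total // 1024) + "k"]
    for size, path in rows:
        lines.append(path[-width:].ljust(width) + " : " + str(size // 1024) + "k")
    return "\n".join(lines)
```

Calling print(format_report(*biggest_files())) from the folder to inspect prints the report. Unlike the script above, this sketch yields paths relative to the folder it is given rather than whatever the original helper returns.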




For example, here is how to run the script in c:\windows to find out which files take the most space in the OS:



python get_space_hoggers_report.py | tee report.txt


And here is the result:

Total size: 18528928k
orms\9c6fe9d44d22834993e9aa23cc9dc272\System.Windows.Forms.ni.dll : 12139k
31bf3856ad364e35_6.0.6001.18000_none_c0a3fbb5ef29fe27\Mahjong.dll : 12261k
31bf3856ad364e35_6.0.6002.18005_none_c28f74c1ec4bc973\Mahjong.dll : 12261k
orms\17e020ae92d7fab33bcc1c98b25019d0\System.Windows.Forms.ni.dll : 12701k
Entity\642a7b3d47828fb0070a55cfeb58f42b\System.Data.Entity.ni.dll : 12962k
load\41bec7591f57a2b41248a2c1d4189ab0\Windows6.0-KB944036-x86.cab : 13073k
c:\Windows\Fonts\gulim.ttc                                        : 13207k
6ad364e35_6.0.6000.16386_none_4355a8715fa423d5_gulim.ttc_7c526737 : 13207k
m_31bf3856ad364e35_6.0.6000.16386_none_4355a8715fa423d5\gulim.ttc : 13207k
s\System32\DriverStore\FileRepository\nvdj.inf_d1096b58\nvcpl.dll : 13234k
s\System32\DriverStore\FileRepository\nvdj.inf_e166b159\nvcpl.dll : 13234k
s\System32\DriverStore\FileRepository\nvdj.inf_f4eaea07\nvcpl.dll : 13234k
a_31bf3856ad364e35_6.0.6001.18000_none_03ed68ae2c4994ef\dicjp.bin : 13259k
c:\Windows\System32\xlivefnt.dll                                  : 13322k
c:\Windows\System32\nvcpl.dll                                     : 13363k
c:\Windows\Fonts\simsun.ttc                                       : 13424k
ad364e35_6.0.6000.16386_none_f8d25d0e72c3c090_simsun.ttc_eba56c14 : 13424k
_31bf3856ad364e35_6.0.6000.16386_none_f8d25d0e72c3c090\simsun.ttc : 13424k
d_31bf3856ad364e35_6.0.6000.16386_none_770bd33f8d44346e\ehcir.ird : 13575k
ache$\Managed\00002109030000000000000000F01FEC\12.0.4518\OART.DLL : 13819k
c:\Windows\System32\xlive.dll                                     : 13976k
wo#\b89f584d5b315c16d4e57e747158cb69\PresentationFramework.ni.dll : 13992k
wo#\0832f9155d800cb802e70409447c1128\PresentationFramework.ni.dll : 13993k
0319_32\mscorlib\246f1a5abb686b9dcdf22d3505b08cea\mscorlib.ni.dll : 14078k
c:\Windows\Fonts\msjhbd.ttf                                       : 14169k
ad364e35_6.0.6000.16386_none_5c79d760afbbb312_msjhbd.ttf_176cee86 : 14169k
_31bf3856ad364e35_6.0.6000.16386_none_5c79d760afbbb312\msjhbd.ttf : 14169k
c:\Windows\Logs\CBS\CBS.log                                       : 14280k
e$\Managed\00002109030000000000000000F01FEC\12.0.4518\XL12CNV.EXE : 14330k
_31bf3856ad364e35_6.0.6000.16386_none_0c8ed16bb707d3be\msyhbd.ttf : 14341k
c:\Windows\Fonts\msyhbd.ttf                                       : 14343k
ad364e35_6.0.6002.18005_none_10b10c73b114afde_msyhbd.ttf_16e5cd4d : 14343k
_31bf3856ad364e35_6.0.6002.18005_none_10b10c73b114afde\msyhbd.ttf : 14343k
c:\Windows\Fonts\msjh.ttf                                         : 14368k
56ad364e35_6.0.6000.16386_none_6309f686e329e15f_msjh.ttf_ea675e5c : 14368k
ei_31bf3856ad364e35_6.0.6000.16386_none_6309f686e329e15f\msjh.ttf : 14368k
c:\Windows\Fonts\msyh.ttf                                         : 14691k
ei_31bf3856ad364e35_6.0.6000.16386_none_389c8034332e39c5\msyh.ttf : 14691k
c:\Windows\IME\IMEJP10\DICTS\IMJPST.DIC                           : 14726k
_31bf3856ad364e35_6.0.6000.16386_none_7e4e5681ddf0010b\IMJPST.DIC : 14726k
c:\Windows\Installer\1aac20a.msp                                  : 14834k
c:\Windows\System32\nvoglv32.dll                                  : 14878k
ystem32\DriverStore\FileRepository\nvdj.inf_59384ced\nvoglv32.dll : 14878k
c:\Windows\Fonts\simsunb.ttf                                      : 15045k
d364e35_6.0.6000.16386_none_8ec3c7fa1f04c342_simsunb.ttf_08f71e3f : 15045k
31bf3856ad364e35_6.0.6000.16386_none_8ec3c7fa1f04c342\simsunb.ttf : 15045k
c:\Windows\Installer\24fce6d.msp                                  : 15342k
c:\Windows\IME\IMETC10\DICTS\IMTCS.IMD                            : 15444k
y_31bf3856ad364e35_6.0.6000.16386_none_8c1c51f402c169d0\IMTCS.IMD : 15444k
c:\Windows\System32\imageres.dll                                  : 15450k
364e35_6.0.6000.16386_none_da86e136fafaf563_imageres.dll_44f44625 : 15450k
1bf3856ad364e35_6.0.6000.16386_none_da86e136fafaf563\imageres.dll : 15450k
1FEC\12.0.4518\msmdlocal.dll.5DF9D670_534C_4AB2_B0C6_FF0B0C448C29 : 15489k
93892-1000\65AE474ADBD51814280308A67426AEF7\6.2.7000\Combi.04.psi : 15611k
c:\Windows\Fonts\batang.ttc                                       : 15883k
ad364e35_6.0.6000.16386_none_b5b2ca1d695fce16_batang.ttc_949601ce : 15883k
_31bf3856ad364e35_6.0.6000.16386_none_b5b2ca1d695fce16\batang.ttc : 15883k
ndows\System32\spool\drivers\w32x86\PCC\prnhp001.inf_2ade4966.cab : 16103k
c:\Windows\ehome\ehcir.ird                                        : 16170k
d_31bf3856ad364e35_6.0.6000.16663_none_771e77eb8d36a7fc\ehcir.ird : 16170k
d_31bf3856ad364e35_6.0.6000.20804_none_77e9f66ea622b69e\ehcir.ird : 16170k
d_31bf3856ad364e35_6.0.6001.18043_none_791a56698a4d010b\ehcir.ird : 16170k
d_31bf3856ad364e35_6.0.6001.22147_none_79a7f45ca3670631\ehcir.ird : 16170k
d_31bf3856ad364e35_6.0.6002.18005_none_7b2e0e478751108e\ehcir.ird : 16170k
c:\Windows\Fonts\meiryo.ttc                                       : 16318k
ad364e35_6.0.6002.18130_none_76259f2c44aeed75_meiryo.ttc_ab0401d6 : 16318k
_31bf3856ad364e35_6.0.6000.16945_none_72531e3e4a65a4dd\meiryo.ttc : 16318k
_31bf3856ad364e35_6.0.6000.21148_none_72df94096380c3ee\meiryo.ttc : 16318k
_31bf3856ad364e35_6.0.6001.18349_none_743d5e0c47889b2a\meiryo.ttc : 16318k
_31bf3856ad364e35_6.0.6001.22550_none_74b32a3760b66ffd\meiryo.ttc : 16318k
_31bf3856ad364e35_6.0.6002.18130_none_76259f2c44aeed75\meiryo.ttc : 16318k
_31bf3856ad364e35_6.0.6002.22252_none_769b9cb35ddaf7cf\meiryo.ttc : 16318k
c:\Windows\System32\wbem\Logs\WMITracing.log                      : 16384k
c:\Windows\System32\config\COMPONENTS.SAV                         : 16452k
Cache$\Managed\00002109030000000000000000F01FEC\12.0.4518\MSO.DLL : 16475k
c:\Windows\Fonts\meiryob.ttc                                      : 16757k
d364e35_6.0.6002.18130_none_cf13a97974e4cf1c_meiryob.ttc_d9ebd964 : 16757k
31bf3856ad364e35_6.0.6000.16945_none_cb41288b7a9b8684\meiryob.ttc : 16757k
31bf3856ad364e35_6.0.6000.21148_none_cbcd9e5693b6a595\meiryob.ttc : 16757k
31bf3856ad364e35_6.0.6001.18349_none_cd2b685977be7cd1\meiryob.ttc : 16757k
31bf3856ad364e35_6.0.6001.22550_none_cda1348490ec51a4\meiryob.ttc : 16757k
31bf3856ad364e35_6.0.6002.18130_none_cf13a97974e4cf1c\meiryob.ttc : 16757k
31bf3856ad364e35_6.0.6002.22252_none_cf89a7008e10d976\meiryob.ttc : 16757k
000\65AE474ADBD51814280308A67426AEF7\6.2.7000\Le_Petit_Druide.psi : 16829k
Model\52cbaee4e94489731096be5ecc320958\System.ServiceModel.ni.dll : 16996k
che$\Managed\00002109030000000000000000F01FEC\12.0.4518\WWLIB.DLL : 17073k
wo#\7f91eecda3ff7ce478146b6458580c98\PresentationFramework.ni.dll : 17216k
che$\Managed\00002109030000000000000000F01FEC\12.0.4518\EXCEL.EXE : 17471k
Model\250b525aa8c17327216e102569c0d766\System.ServiceModel.ni.dll : 17499k
c:\Windows\Installer\fe5e2c.msi                                   : 17755k
c:\Windows\System32\WDI\LogFiles\BootCKCL.etl                     : 17792k
c:\Windows\System32\IME\IMETC10\applets\MSHWCHTR.dll              : 19522k
1bf3856ad364e35_6.0.6001.18000_none_fb2914a7fb7f05d4\MSHWCHTR.dll : 19522k
1bf3856ad364e35_6.0.6002.18005_none_fd148db3f8a0d120\MSHWCHTR.dll : 19522k
1bf3856ad364e35_6.0.6001.18000_none_fd48368c658afbaa\mshwchtr.dll : 19522k
ache$\Managed\68AB67CA7DA73301B7449A0300000010\9.3.0\AcroRd32.dll : 19957k
_msige52\program files\Google\Google Earth\client\googleearth.exe : 20428k
c:\Windows\System32\winevt\Logs\Security.evtx                     : 20484k
c:\Windows\System32\winevt\Logs\System.evtx                       : 20484k
c:\Windows\Installer\1aac08e.msp                                  : 20889k
c:\Windows\System32\IME\IMEJP10\APPLETS\mshwjpnr.dll              : 20959k
1bf3856ad364e35_6.0.6000.16386_none_29bd61de3dbf60e5\mshwjpnr.dll : 20959k
1bf3856ad364e35_6.0.6001.18000_none_03ed68ae2c4994ef\mshwjpnr.dll : 20959k
c:\Windows\System32\IME\imekr8\applets\mshwkorr.dll               : 21316k
1bf3856ad364e35_6.0.6000.16386_none_4e1eb5b4af3fbd40\mshwkorr.dll : 21316k
1bf3856ad364e35_6.0.6001.18000_none_03ed2a082c4a1514\mshwkorr.dll : 21316k
1bf3856ad364e35_6.0.6001.18000_none_fd484d54658ae209\mshwchsr.dll : 21448k
c:\Windows\System32\wbem\repository\OBJECTS.DATA                  : 22528k
c:\Windows\Fonts\ARIALUNI.TTF                                     : 22730k
c:\Windows\Speech\Engines\SR\en-US\t1033.ngr                      : 22858k
_31bf3856ad364e35_6.0.6000.16386_en-us_cbfb04a3abf30016\t1033.ngr : 22858k
e52\program files\Google\Google Earth\plugin\googleearth_free.dll : 22880k
naged\00002109F10090400000000000F01FEC\12.0.4518\NLSDATA.DLL_1033 : 23818k
0002109030000000000000000F01FEC\12.0.4518\INSTALLED_RESOURCES.XSS : 24288k
c:\Windows\inf\setupapi.dev.log                                   : 24433k
c:\Windows\System32\config\RegBack\SYSTEM.OLD                     : 25552k
c:\Windows\Fonts\mingliu.ttc                                      : 26851k
31bf3856ad364e35_6.0.6000.16386_none_b8e3a7d58b1249ca\mingliu.ttc : 26851k
c:\Windows\Speech\Engines\SR\en-US\l1033.ngr                      : 27833k
_31bf3856ad364e35_6.0.6000.16386_en-us_cbfb04a3abf30016\l1033.ngr : 27833k
3856ad364e35_6.0.6001.18000_none_062b7e7afe71e492\PurblePlace.dll : 27994k
3856ad364e35_6.0.6002.18005_none_0816f786fb93afde\PurblePlace.dll : 27994k
s_31bf3856ad364e35_6.0.6001.18000_none_74d4a1cd7e673a2e\Chess.dll : 28321k
s_31bf3856ad364e35_6.0.6002.18005_none_76c01ad97b89057a\Chess.dll : 28321k
1bf3856ad364e35_6.0.6000.16386_none_0d44c2d7a6e22754\M1033DSK.CSD : 29099k
c:\Windows\System32\config\RegBack\COMPONENTS.OLD                 : 31148k
c:\Windows\System32\mrt.exe                                       : 31710k
c:\Windows\Fonts\mingliub.ttc                                     : 32999k
364e35_6.0.6000.16386_none_c6eae5a23b4a0d1e_mingliub.ttc_b8743970 : 32999k
1bf3856ad364e35_6.0.6000.16386_none_c6eae5a23b4a0d1e\mingliub.ttc : 32999k
32\DriverStore\FileRepository\nvdj.inf_05fd020f\NvCplSetupInt.exe : 37308k
93892-1000\65AE474ADBD51814280308A67426AEF7\6.2.7000\Combi.01.psi : 38261k
32\DriverStore\FileRepository\nvdj.inf_59384ced\NvCplSetupInt.exe : 39343k
c:\Windows\ehome\en-US\Intro.wmv                                  : 45166k
_31bf3856ad364e35_6.0.6000.16386_en-us_35933539ffce9bad\Intro.wmv : 45166k
c:\Windows\System32\config\RegBack\SOFTWARE.OLD                   : 45532k
5_6.0.6000.16386_none_3264f7ee9b82c6e1\Jewels of Caribbean.dvr-ms : 45830k
856ad364e35_6.0.6000.16386_none_3264f7ee9b82c6e1\Apollo 13.dvr-ms : 48902k
c:\Windows\Installer\4dac3f.msp                                   : 49713k
f3856ad364e35_6.0.6000.16386_none_3264f7ee9b82c6e1\Vertigo.dvr-ms : 51846k
c:\Windows\Speech\Engines\SR\en-GB\l2057.ngr                      : 55999k
_31bf3856ad364e35_6.0.6000.16386_en-gb_857893b11436ae5f\l2057.ngr : 55999k
c:\Windows\IME\IMESC5\DICTS\PINTLGT.IMD                           : 65408k
31bf3856ad364e35_6.0.6000.16386_none_b4aaff4041e28397\PINTLGT.IMD : 65408k
c:\Windows\Logs\CBS\CBS.persist.log                               : 68341k
c:\Windows\SoftwareDistribution\DataStore\DataStore.edb           : 77832k
c:\Windows\Installer\13ea212.msp                                  : 99335k
crosoft.NET\Framework\v4.0.30319\SetupCache\Client\netfx_core.mzz : 113164k
c:\Windows\winsxs\ManifestCache\6.0.6002.18005_001c11ba_blobs.bin : 188770k
c:\Windows\Installer\1aac1e6.msp                                  : 335018k

As you can see, deleting the last two files (which seem to be cache files that were never cleaned up) would free about 500 MB.
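
As a quick check of that estimate, the two sizes can be added up directly from the report rows. This is only a sketch: parse_size_kb is a hypothetical helper that assumes each row ends in " : <number>k", as in the output above.

```python
def parse_size_kb(row):
    """Return the size in KB from a report row such as 'path : 1234k'."""
    size_field = row.rsplit(" : ", 1)[1]
    return int(size_field.rstrip("k"))


# The last two rows of the report above, copied as-is.
last_two_rows = [
    r"c:\Windows\winsxs\ManifestCache\6.0.6002.18005_001c11ba_blobs.bin : 188770k",
    r"c:\Windows\Installer\1aac1e6.msp                                  : 335018k",
]

freed_kb = sum(parse_size_kb(row) for row in last_two_rows)
print(str(freed_kb) + "k, about " + str(freed_kb // 1024) + " MB")
# prints: 523788k, about 511 MB
```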

I bundled the Python script as a compiled win32 binary. I use it when I am on a (Windows) lab computer that does not have Python installed and I need to hunt down a few big files.

On a side note, py2exe (http://www.py2exe.org) is an amazing tool that works quite well. It takes about 5 minutes to download and install it and to compile your script into a standalone Windows application. For reference, here is the small script that uses py2exe:

#Launch with: python26 make_bin.py py2exe
from distutils.core import setup
import py2exe

setup(
    console=['get_space_hoggers_report.py'],
    options={"py2exe":{"bundle_files":1}}    
    )

Happy hunting!

1 comment:

Anonymous said...

You could just download TreeSize Professional for FREE from CNET and it will search your whole computer.. just sayin.....