python - Can a unique list accept lower and uppercase entries?
My script logs information on unique file types in a directory and its subdirectories. In the process of creating a unique list of file extensions, the current code seems to consider case variants of an extension (jpg, JPG and Jpg) the same and includes only 1 of them in the list. How can I include all 3 or more variants?
    for root, dirs, files in os.walk(sourcedir, topdown=False):
        for fl in files:
            currentfile = os.path.join(root, fl)
            # everything after the last dot is treated as the extension
            ext = fl[fl.rfind('.')+1:]
            if ext != '':
                if dirlimiter in currentfile:
                    list.append(currentfile)
                    directory1 = os.path.basename(os.path.normpath(currentfile[:currentfile.rfind(dirlimiter)]))
                    directory2 = (currentfile[len(sourcedir):currentfile.rfind('\\'+directory1+dirlimiter)])
                    directory = directory2+'\\'+directory1
                    if directory not in dirlist:
                        dircount += 1
                        dirlist.append(directory)
                    if ext not in extlist:
                        extlist.append(ext)
The full script is in this question on Stack Exchange: Recurse through directories and log files by file type in Python
Thanks jennak. On further investigation I found that the input in the jpg report had both jpg and JPG entries in the file, as shown below.
    44;x:\scratch\project\input\foreshore , jetties package 3\487679 - jetty\img_1630.jpg;3755267
    45;x:\scratch\project\input\foreshore , jetties package 3\487679 - jetty\img_1633.jpg;2447135
    1;x:\scratch\project\input\649701 - hill close\2263.jpg;405328
    2;x:\scratch\project\input\649701 - hill close\2264.jpg;372770
So it first got the details of the JPG files, then those of the jpg files, and put them in a single report, which is more convenient than having 2 reports. I guess I programmed it better than I thought :-)
No; with a list, the in operator checks for equality, and strings are only equal to one another when they use the same case.
You could use a set here, and store the directory.lower() values in it. Sets are also (a lot) faster at membership testing than lists:
    import os

    directories = set()
    extensions = set()

    for root, dirs, files in os.walk(sourcedir, topdown=False):
        # ...
        # no need to use `directory.lower() in directories`, just update the set:
        directories.add(directory.lower())
        # ...
        extensions.add(ext.lower())
The dircount variable can be derived later on:
    dircount = len(directories)
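
As a small illustration of the point above, the following sketch shows that membership tests are case-sensitive and that lowercasing before adding to a set collapses the variants. The sample extensions are made up for the example and do not come from the original script.

```python
# Hypothetical sample data, not taken from the original script.
extensions_seen = ["jpg", "JPG", "Jpg", "png"]

# Membership tests compare strings exactly, so case matters.
print("JPG" in extensions_seen)   # this exact spelling is in the list
print("jPG" in extensions_seen)   # this spelling is not

# A plain set keeps every case variant as a separate entry.
case_sensitive = set(extensions_seen)

# Lowercasing before adding collapses the variants into one entry.
case_insensitive = {ext.lower() for ext in extensions_seen}

# The count is derived from the set instead of a running counter.
extcount = len(case_insensitive)
print(sorted(case_sensitive), sorted(case_insensitive), extcount)
```

If every case variant should be reported separately, the plain set is enough; the lowercased set is only needed when the variants should count as one.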
You want to look into the functions provided by os.path more, in particular the os.path.splitext(), os.path.relpath(), and os.path.join() functions.
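
A minimal sketch of what these three functions do, using made-up path components (the directory and file names here are assumptions for illustration only):

```python
import os.path

# Hypothetical paths for illustration only.
sourcedir = os.path.join("project", "input")
filename = os.path.join(sourcedir, "jetty", "IMG_1630.JPG")

# splitext() splits off the last extension and keeps the leading dot.
base, ext = os.path.splitext(filename)

# relpath() expresses filename relative to sourcedir.
relative = os.path.relpath(filename, sourcedir)

print(ext)
print(relative)
```

Note that the extension returned by os.path.splitext() includes the dot, unlike the slice based on rfind('.') in the question.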
Your file handling in the loop can be simplified a lot:
    for fl in files:
        filename = os.path.join(root, fl)
        # ext includes the leading dot and is empty when there is no extension
        base, ext = os.path.splitext(filename)
        if ext:
            list.append(filename)
            directory = os.path.relpath(filename, sourcedir)
            directories.add(directory.lower())
            extensions.add(ext)
Note that I use just os.path.relpath() here; your os.path.basename() and os.path.normpath() dance plus delimiters, etc. was needlessly complicated.
Now, reading between the lines, it seems you want to consider extensions equal whatever the case of just that part.
In that case, build a new filename from the result of os.path.splitext():
    base, ext = os.path.splitext(filename)
    normalized_filename = base + ext.lower()
Now normalized_filename is the filename with the extension lowercased, so you can use that value in sets as needed.
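
One way this could be put together is sketched below. The helper name normalize_extension, the variants mapping and the sample file names are all hypothetical; they are not part of the original script. The mapping is one possible way to treat extensions as equal regardless of case while still recording which spellings actually occurred.

```python
import os.path
from collections import defaultdict


def normalize_extension(filename):
    """Return filename with only its extension lowercased (hypothetical helper)."""
    base, ext = os.path.splitext(filename)
    return base + ext.lower()


# Hypothetical file names, not taken from the original report.
names = ["IMG_1630.JPG", "2263.jpg", "Notes.TXT", "README"]

# Set of file names whose extensions compare equal regardless of case.
normalized = {normalize_extension(name) for name in names}

# Map each lowercased extension to the spellings actually seen.
variants = defaultdict(set)
for name in names:
    ext = os.path.splitext(name)[1]
    if ext:  # skip files without an extension
        variants[ext.lower()].add(ext)

print(sorted(normalized))
print({key: sorted(value) for key, value in variants.items()})
```

The base of each name is left untouched, so only the extension takes part in the case-insensitive comparison.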