Equivalent to /*.txt of bash in python [duplicate] - python

This question already has an answer here:
Find all files in a directory with extension .txt in Python
31 answers
In bash, ${files_path}/*.txt expands to all the .txt files in the given path. Is there an equivalent in Python?

glob is exactly what you're looking for.
The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell, although results are returned in arbitrary order. No tilde expansion is done, but *, ?, and character ranges expressed with [] will be correctly matched.
Using glob module:
>>> import os, glob
>>> os.listdir('.')
['package', 'test.py', 'test.pyc', 'test2.py', 'test2.pyc']
>>> glob.glob('*.pyc')
['test.pyc', 'test2.pyc']
>>>
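As a rough sketch of how the bash pattern ${files_path}/*.txt could map onto glob, the helper below joins a directory with the *.txt pattern and sorts the matches, since glob makes no promise about ordering. The names list_txt_files and files_path are made up for illustration and are not part of the glob module.

```python
import glob
import os


def list_txt_files(files_path):
    """Return the .txt paths directly inside files_path, sorted by name.

    list_txt_files and files_path are illustrative names, not glob API.
    """
    # Rough equivalent of the bash pattern ${files_path}/*.txt
    pattern = os.path.join(files_path, "*.txt")
    # glob.glob returns matches in arbitrary order, so sort for stable output
    return sorted(glob.glob(pattern))
```

Like the shell, glob skips names that start with a dot unless the pattern itself starts with one.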

Import os
List all files in the directory
Filter all files that end with "txt"
So something like this:
import os
path = "."  # replace with the directory you want to search
txt_files = [name for name in os.listdir(path) if name.endswith(".txt")]

If you use the os module, os.listdir returns every entry in the directory (not only the .txt files):
import os
PATH = "GIVE PATH HERE"
print(os.listdir(PATH))

Related

Python: How can i find a directory that matches the first 3 characters from a string?

I have a directory with 50+ directories inside which are named "XXX - something"
If I have X = '123'
How can I find the directory that starts with '123'?
You can try this using os.walk
import os
[i[0] for i in os.walk('/path/to/directory/') if i[0].split("/")[-1].startswith(X)]
This returns a list of every directory under /path/to/directory/, searched recursively, whose name starts with X (your variable).
OR
[i for i,j,k in os.walk('/path/to/directory/') if i.split("/")[-1].startswith(X)]
import os, sys
folder = sys.argv[1]
folders = "ls -lh %s*" %(folder)
os.system(folders)
Run the script as folderseach.py 123,
where 123 is the prefix of the folder you'd like to find.
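If only the directories directly inside one parent folder matter, a non-recursive lookup may be enough. The following is a minimal sketch under that assumption; find_dirs_with_prefix, parent and prefix are hypothetical names chosen for this example, and os.scandir requires Python 3.5 or newer.

```python
import os


def find_dirs_with_prefix(parent, prefix):
    """Return names of immediate subdirectories of parent starting with prefix.

    find_dirs_with_prefix is an illustrative helper, not a standard function.
    """
    return sorted(
        entry.name
        for entry in os.scandir(parent)
        # keep directories only, and only those whose name has the prefix
        if entry.is_dir() and entry.name.startswith(prefix)
    )
```

With X = '123', calling find_dirs_with_prefix('/path/to/directory/', X) would return the matching folder names rather than full paths.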

getting the absolute path of a file inside several directories and subdirectories

I am supposed to get the absolute path of a file which is present in dir3.
the path is
"C:\\Workspace\\folder1\\folder2\\file"
And the only input I am supposed to provide is the name of the file and the name of the major directory in C drive i.e. Workspace.
Can I get the absolute path using any built-in function in Python? I tried using this code but it gave me erroneous results:
import os
x='workspace'
y='file_name'
path_1=os.path.abspath("workspace/file_name")
print(path_1)
output:
C:\Workspace\workspace\file_name
I think it should work with
os.path.abspath("workspace/file_name")
Edit:
I tried in python console a few secs ago:
import os
os.path.abspath("Bachelor/simpleOpenCL.py")
'/home/julius/Bachelor/simpleOpenCL.py'
Take a look at this ActiveState recipe.
It contains a function definition for recursive file searches:
import os, fnmatch
def locate(pattern, root=os.curdir):
    '''Locate all files matching supplied filename pattern in and below
    supplied root directory.'''
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename)
Example use of the function:
for x in locate("*.zip", "C:\\Temp"):
    print(x)
The built-in doesn't work like that, but you can pretty handily make your own function using some methods from os.
Assuming Python 3:
os.walk
Assuming Python 2.7:
os.path.walk (os.walk is available there as well)
Basically, you can split the path you are given using os.path.split(), then use the walk method with the head and check whether the tail is in the result. If you find it, join the directory where it was found with the tail to get the full path of that file.
import os
def locate(headnname):
    abspaths = []
    head, tail = os.path.split(headnname)
    if not os.path.isdir(head):
        raise IOError("not a valid head: %s" % head)
    for dp, dn, fn in os.walk(head):
        if tail in fn:
            abspaths.append(dp + "/" + tail)
    return abspaths
output:
>>> locate("D:/users/admin/pytools.py")
['D:/users/admin\\Programs\\AT_Plotter\\src/pytools.py',
'D:/users/admin\\Programs\\py2exe/pytools.py',
'D:/users/admin\\Programs\\pytools\\src/pytools.py',
'D:/users/admin\\Shared\\pyIO/pytools.py',
'D:/users/admin\\Shared\\pyIO\\Old/pytools.py']
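For the original question (only the top-level directory and the file name are known), the same search can be sketched with pathlib, assuming Python 3.5 or newer. The names find_file, root and filename are hypothetical and chosen for this example; note that rglob treats filename as a glob pattern, so wildcard characters in it would be expanded.

```python
from pathlib import Path


def find_file(root, filename):
    """Return absolute paths of every file named filename at or below root.

    find_file is an illustrative helper, not part of pathlib.
    """
    # resolve() makes the starting point absolute, so every match is absolute
    root_path = Path(root).resolve()
    # rglob searches root_path and all of its subdirectories
    return sorted(p for p in root_path.rglob(filename) if p.is_file())
```

A call such as find_file("C:\\Workspace", "file") would return a list, because the same file name may exist in more than one subdirectory.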

List files in a directory having more than one space

I have this code:
for f in os.listdir(ftpUploaddir):
    if os.path.isfile(os.path.join(ftpUploaddir, f)):
        # Filter files having .png as extension
        if f[-4:] == ".png":
            print("from directory", f)
It does not list the files having more than one space, e.g.:
100002044_A   h_HD_XXX_20120229_141236.png
There are 3 spaces between A and h.
I know files with a single space will be listed, but not those with multiple spaces.
Even ls will not list the files using ls/*.png. Any help appreciated.
Try doing this to see if the file is really there -- perhaps (as Sven suggested) there's a space or some other character after the ".png"?
for f in os.listdir(ftpUploaddir):
    if "h_HD_XXX_20120229_141236" in f:
        print("Full name is %r" % f)
        if not os.path.isfile(os.path.join(ftpUploaddir, f)):
            print(" (but it's not a file?)")
I can't reproduce this problem. Try running this Python script:
# create a file with multiple spaces in the name
outf = open("100002044_A   h_HD_XXX_20120229_141236.png", "w")
outf.write("hello, world")
outf.close()
# see if os.listdir can find it
import os
print("100002044_A   h_HD_XXX_20120229_141236.png" in os.listdir(os.getcwd()))
For me, it's always printing True.
I tried using fnmatch module but can't reproduce the problem.
>>> import os
>>> import fnmatch
>>> os.listdir(r'C:\Users\RanRag\python\test')
['gameicon.png', 'grass i test.png', 'hello.txt']
>>> for file in os.listdir(r'C:\Users\RanRag\python\test'):
...     if fnmatch.fnmatch(file, '*.png'):
...         print(file)
...
gameicon.png
grass i test.png
>>>
Your ls command should be: ls *.png. If you really used a slash, it's no surprise it's not working.
I'd check if your ftpUploaddir is correct: Is your script finding any files when you run it? With the right path, your script should work as written.
Incidentally, it's easier to find files with a particular extension like this:
import glob, os
for f in glob.glob(os.path.join(ftpUploaddir, "*.png")):
    print(f)

Search for a file using a wildcard

I want to get a list of filenames that match a search pattern containing a wildcard. Like:
getFilenames.py c:\PathToFolder\*
getFilenames.py c:\PathToFolder\FileType*.txt
getFilenames.py c:\PathToFolder\FileTypeA.txt
How can I do this?
Like this:
>>> import glob
>>> glob.glob('./[0-9].*')
['./1.gif', './2.txt']
>>> glob.glob('*.gif')
['1.gif', 'card.gif']
>>> glob.glob('?.gif')
['1.gif']
This comes straight from here: http://docs.python.org/library/glob.html
glob is useful if you are doing this within Python; however, your shell may not be passing in the * (I'm not familiar with the Windows shell).
For example, when I do the following:
import sys
print(sys.argv)
On my shell, I type:
$ python test.py *.jpg
I get this:
['test.py', 'test.jpg', 'wasp.jpg']
Notice that argv does not contain "*.jpg"
The important lesson here is that most shells will expand the asterisk at the shell, before it is passed to your application.
In this case, to get the list of files, I would just do sys.argv[1:]. Alternatively, you could escape the *, so that python sees the literal *. Then, you can use the glob module.
$ getFileNames.py "*.jpg"
or
$ getFileNames.py \*.jpg
from glob import glob
import sys
files = glob(sys.argv[1])
I am adding this to the previous answer because I found it very useful when you want your scripts to work on multiple shells and with multiple parameters using *.
If you want something that works on every shell, you can do the following (still using glob):
>>> import glob
>>> import sys
>>> from functools import reduce # if using python 3+
>>> reduce(lambda r, x: r + glob.glob(x), sys.argv[1:], [])
Note that it can produce duplicates (for example, if you have a file named test and you pass both t* and te*), but you can simply remove them using a set:
>>> set(reduce(lambda r, x: r + glob.glob(x), sys.argv[1:], []))
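The same idea can be written as a small function that also keeps a predictable order, which a plain set does not. This is only a sketch: expand_patterns is a hypothetical name, and in a script it would typically be called with sys.argv[1:].

```python
import glob


def expand_patterns(patterns):
    """Expand each glob pattern and return the matches without duplicates.

    expand_patterns is an illustrative helper, not part of the glob module.
    """
    seen = set()
    result = []
    for pattern in patterns:
        # sorted() makes the order deterministic; glob itself promises none
        for path in sorted(glob.glob(pattern)):
            if path not in seen:
                seen.add(path)
                result.append(path)
    return result
```

If the shell has already expanded the wildcard, each argument is a plain file name, and glob.glob simply returns it unchanged when the file exists.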

Getting a list of all subdirectories in the current directory

Is there a way to return a list of all the subdirectories in the current directory in Python?
I know you can do this with files, but I need to get the list of directories instead.
Do you mean immediate subdirectories, or every directory right down the tree?
Either way, you could use os.walk to do this:
os.walk(directory)
will yield a tuple for each subdirectory. The first entry in the 3-tuple is a directory name, so
[x[0] for x in os.walk(directory)]
should give you all of the subdirectories, recursively.
Note that the second entry in the tuple is the list of child directories of the entry in the first position, so you could use this instead, but it's not likely to save you much.
However, you could use it just to give you the immediate child directories:
next(os.walk('.'))[1]
Or see the other solutions already posted, using os.listdir and os.path.isdir, including those at "How to get all of the immediate subdirectories in Python".
import os
d = '.'
[os.path.join(d, o) for o in os.listdir(d)
if os.path.isdir(os.path.join(d,o))]
You could just use glob.glob
from glob import glob
glob("/path/to/directory/*/")
Don't forget the trailing / after the *.
If you need a recursive solution that will find all the subdirectories in the subdirectories, use walk as proposed before.
If you only need the current directory's child directories, combine os.listdir with os.path.isdir
I prefer using filter (https://docs.python.org/2/library/functions.html#filter), but this is just a matter of taste.
d='.'
filter(lambda x: os.path.isdir(os.path.join(d, x)), os.listdir(d))
In Python 3.5+ you can use os.scandir, which is much nicer than the above because you don't need several os.path.join() calls and you get the full path directly (if you wish):
subfolders = [f.path for f in os.scandir(folder) if f.is_dir() ]
This will give the complete path to the subdirectory.
If you only want the name of the subdirectory use f.name instead of f.path
https://docs.python.org/3/library/os.html#os.scandir
Implemented this using python-os-walk. (http://www.pythonforbeginners.com/code-snippets-source-code/python-os-walk/)
import os
print("root prints out directories only from what you specified")
print("dirs prints out sub-directories from root")
print("files prints out all files from root and directories")
print("*" * 20)
for root, dirs, files in os.walk("/var/log"):
    print(root)
    print(dirs)
    print(files)
You can get the list of subdirectories (and files) in Python 2.7 using os.listdir(path)
import os
os.listdir(path) # list of subdirectories and files
Thanks for the tips, guys. I ran into an issue with softlinks (infinite recursion) being returned as dirs. Softlinks? We don't want no stinkin' soft links! So...
This rendered just the dirs, not softlinks:
>>> import os
>>> inf = os.walk('.')
>>> [x[0] for x in inf]
['.', './iamadir']
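For the soft-link concern above, one possible approach (not the poster's own code) is to ask os.scandir not to follow links when testing each entry. This sketch assumes Python 3.6 or newer for the with statement; real_subdirs is a hypothetical name.

```python
import os


def real_subdirs(parent):
    """Return names of immediate subdirectories of parent, skipping symlinks.

    real_subdirs is an illustrative helper, not a standard function.
    """
    with os.scandir(parent) as entries:
        return sorted(
            entry.name
            for entry in entries
            # follow_symlinks=False makes a link to a directory report False
            if entry.is_dir(follow_symlinks=False)
        )
```

This only covers the immediate children; os.walk already avoids descending into linked directories by default because its followlinks argument is False.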
Since I stumbled upon this problem using Python 3.4 and Windows UNC paths, here's a variant for this environment:
from pathlib import WindowsPath
def SubDirPath(d):
    return [f for f in d.iterdir() if f.is_dir()]
subdirs = SubDirPath(WindowsPath(r'\\file01.acme.local\home$'))
print(subdirs)
Pathlib is new in Python 3.4 and makes working with paths under different OSes much easier:
https://docs.python.org/3.4/library/pathlib.html
Building upon Eli Bendersky's solution, use the following example:
import os
test_directory = "<your_directory>"
for child in os.listdir(test_directory):
    test_path = os.path.join(test_directory, child)
    if os.path.isdir(test_path):
        print(test_path)
        # Do stuff to the directory "test_path"
where <your_directory> is the path to the directory you want to traverse.
Here are a couple of simple functions based on @Blair Conrad's example -
import os
def get_subdirs(dir):
    "Get a list of immediate subdirectories"
    return next(os.walk(dir))[1]
def get_subfiles(dir):
    "Get a list of immediate subfiles"
    return next(os.walk(dir))[2]
With full path and accounting for path being '.', '..', '\', '..\..\subfolder', etc
import os, pprint
pprint.pprint([os.path.join(os.path.abspath(path), x[0]) for x in os.walk(os.path.abspath(path))])
Although this question was answered a long time ago, I want to recommend the pathlib module, since it is a robust way to work with paths on both Windows and Unix.
So to get all .txt paths in a specific directory, including subdirectories:
from pathlib import Path
paths = list(Path('myhomefolder', 'folder').glob('**/*.txt'))
# all sorts of operations
file = paths[0]
file.name
file.stem
file.parent
file.suffix
etc.
Listing out only directories
import os
print("\nWe are listing out only the directories in current directory -")
# list() is needed on Python 3, where filter returns a lazy iterator
directories_in_curdir = list(filter(os.path.isdir, os.listdir(os.curdir)))
print(directories_in_curdir)
Listing out only files in current directory
files = list(filter(os.path.isfile, os.listdir(os.curdir)))
print("\nThe following are the list of all files in the current directory -")
print(files)
This answer didn't seem to exist already.
directories = [ x for x in os.listdir('.') if os.path.isdir(x) ]
Python 3.4 introduced the pathlib module into the standard library, which provides an object oriented approach to handle filesystem paths:
from pathlib import Path
p = Path('./')
# List comprehension
[f for f in p.iterdir() if f.is_dir()]
# The trailing slash in the glob pattern indicates directories
# This will also include the current directory '.'
list(p.glob('**/'))
Pathlib is also available on Python 2.7 via the pathlib2 module on PyPI.
I've had a similar question recently, and I found out that the best answer for python 3.6 (as user havlock added) is to use os.scandir. Since it seems there is no solution using it, I'll add my own. First, a non-recursive solution that lists only the subdirectories directly under the root directory.
import os
def get_dirlist(rootdir):
    dirlist = []
    with os.scandir(rootdir) as rit:
        for entry in rit:
            if not entry.name.startswith('.') and entry.is_dir():
                dirlist.append(entry.path)
    dirlist.sort() # Optional, in case you want sorted directory names
    return dirlist
The recursive version would look like this:
def get_dirlist(rootdir):
    dirlist = []
    with os.scandir(rootdir) as rit:
        for entry in rit:
            if not entry.name.startswith('.') and entry.is_dir():
                dirlist.append(entry.path)
                dirlist += get_dirlist(entry.path)
    dirlist.sort() # Optional, in case you want sorted directory names
    return dirlist
Keep in mind that entry.path yields the full path to the subdirectory, which is absolute only if rootdir was absolute. In case you only need the folder name, you can use entry.name instead. Refer to os.DirEntry for additional details about the entry object.
Use os.path.isdir as a filter function over os.listdir().
Something like this: filter(os.path.isdir, [os.path.join(os.path.abspath('PATH'), p) for p in os.listdir('PATH/')])
If you want just the top-level folders, use listdir, as walk takes too much time.
