Tuesday, May 19, 2020

Using Glob with Directories in Ruby

Globbing files (with Dir.glob) in Ruby allows you to select just the files you want, such as all the XML files, in a given directory. Although glob patterns resemble regular expressions, they are not regular expressions. Globbing is very limited compared to Ruby's regular expressions and is more closely related to shell expansion wildcards. The opposite of globbing, iterating over all the files in a directory, can be done with the Dir.foreach method.

Example

The following glob will match all files ending in .rb in the current directory. It uses a single wildcard, the asterisk. The asterisk will match zero or more characters, so any file ending in .rb will match this glob. The one exception is hidden files: by default the asterisk does not match a leading period, so a file called simply .rb, with nothing before the file extension and its preceding period, is skipped unless you pass the File::FNM_DOTMATCH flag. The glob method will return all files that match the globbing rules as an array, which can be saved for later use or iterated over.

    #!/usr/bin/env ruby

    Dir.glob('*.rb').each do |f|
      puts f
    end

Wildcards and More

There are only a few wildcards to learn:

* – Match zero or more characters. A glob consisting of only the asterisk and no other characters or wildcards will match all files in the current directory, apart from hidden files whose names begin with a period. The asterisk is usually combined with a file extension, if not more characters, to narrow down the search.

** – Match all directories recursively. This is used to descend into the directory tree and find all files in sub-directories of the current directory, rather than just files in the current directory. This wildcard is explored in the example code below.

? – Match any one character. This is useful for finding files whose names are in a particular format. For example, 5 characters and a .xml extension could be expressed as ?????.xml.

[a-z] – Match any character in the character set. The set can be either a list of characters or a range separated with the hyphen character. Character sets follow the same syntax as and behave in the same manner as character sets in regular expressions.

{a,b} – Match pattern a or b. Though this looks like a regular expression quantifier, it isn't. For example, in a regular expression, the pattern a{1,2} will match 1 or 2 a characters. In globbing, it will match the string a1 or a2. Other patterns can be nested inside of this construct.

One thing to consider is case sensitivity. It's up to the operating system to determine whether TEST.txt and TeSt.TxT refer to the same file. On Linux and other systems with case-sensitive file systems, these are different files. On Windows, these will refer to the same file.

The operating system is also responsible for the order in which the results are returned. It may differ if you're on Windows versus Linux, for example. Ruby 3.0 and later sort the results by default; on older versions, call sort on the returned array if you need a stable order.

One final thing to note is the Dir[globstring] convenience method. This is functionally the same as Dir.glob(globstring) and is also semantically correct (you are indexing a directory, much like an array). For this reason, you may see Dir[] more often than Dir.glob, but they are the same thing.

Examples Using Wildcards

The following example program will demonstrate as many patterns as it can in many different combinations.

    #!/usr/bin/env ruby

    # Get all .xml files
    Dir['*.xml']

    # Get all files with 5 characters and a .jpg extension
    Dir['?????.jpg']

    # Get all jpg, png and gif images
    Dir['*.{jpg,png,gif}']

    # Descend into the directory tree and get all jpg images
    # Note: this will also find jpg images in the current directory
    Dir['**/*.jpg']

    # Descend into all directories starting with Uni and find all
    # jpg images.
    # Note: this only descends down one directory
    Dir['Uni**/*.jpg']

    # Descend into all directories starting with Uni and all
    # subdirectories of directories starting with Uni and find
    # all .jpg images
    Dir['Uni**/**/*.jpg']
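The wildcard rules can be checked against a throwaway directory without touching your own files. What follows is only a minimal sketch: the sample file names and the glob_in_sandbox helper are hypothetical, made up for this illustration. It assumes the standard tmpdir and fileutils libraries are available, and it sorts every result so the output does not depend on the operating system.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical sample files, created only for this demonstration.
SAMPLE_FILES = %w[
  report.xml
  abcde.xml
  .hidden.xml
  photo.jpg
  a1.txt
  a2.txt
  Unicorn/horse.jpg
  Unicorn/deep/zebra.jpg
].freeze

# Hypothetical helper: create the sample files in a temporary
# directory, run the glob relative to that directory, and return
# the matches sorted alphabetically.
def glob_in_sandbox(pattern, flags = 0)
  Dir.mktmpdir do |root|
    SAMPLE_FILES.each do |path|
      full = File.join(root, path)
      FileUtils.mkdir_p(File.dirname(full))
      FileUtils.touch(full)
    end
    Dir.chdir(root) { Dir.glob(pattern, flags).sort }
  end
end

# The asterisk skips hidden files unless FNM_DOTMATCH is passed.
p glob_in_sandbox('*.xml')
# => ["abcde.xml", "report.xml"]
p glob_in_sandbox('*.xml', File::FNM_DOTMATCH)
# => [".hidden.xml", "abcde.xml", "report.xml"]

# Exactly five characters before the extension.
p glob_in_sandbox('?????.xml')
# => ["abcde.xml"]

# Braces are alternatives, not a quantifier.
p glob_in_sandbox('a{1,2}.txt')
# => ["a1.txt", "a2.txt"]

# One directory level versus a recursive descent.
p glob_in_sandbox('Uni**/*.jpg')
# => ["Unicorn/horse.jpg"]
p glob_in_sandbox('Uni**/**/*.jpg')
# => ["Unicorn/deep/zebra.jpg", "Unicorn/horse.jpg"]
```

Rebuilding the directory on every call is wasteful, but it keeps each glob independent of the others, which is convenient when experimenting with one pattern at a time.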
