Parsing the Directory Listing Returned by the FTP LIST Command

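I had a program that needed to parse the directory listings returned by FTP servers. I assumed this would be fairly simple and read a few posts online, but none of them covered the topic properly. So I downloaded the FileZilla source code. Its source files
directorylistingparser.h
directorylistingparser.cpp
do exactly this job. If you have the same need, they are worth a look. The task takes real effort, because every platform needs its own special handling. Here is the header file: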


#ifndef __DIRECTORYLISTINGPARSER_H__
#define __DIRECTORYLISTINGPARSER_H__

/* This class is responsible for parsing the directory listings returned by
 * the server.
 * Unfortunately, RFC959 did not specify the format of directory listings, so
 * each server uses its own format. In addition to that, in most cases the
 * listings were not designed to be machine-parsable, they were meant to be
 * human readable by users of that particular server.
 * By far the most common format is the one returned by the Unix "ls -l"
 * command. However, legacy systems are still in place, especially in big
 * companies. These often use very exotic listing styles.
 * Another problem are localized listings containing date strings. In some
 * cases these listings are ambiguous and cannot be distinguished.
 * Example for an ambiguous date: 04-05-06. All of the 6 permutations for
 * the location of year, month and day are valid dates.
 * Some servers send multiline listings where a single entry can span two
 * lines, this has to be detected as well, as far as possible.
 *
 * Some servers send MVS style listings which can consist of just the
 * filename without any additional data. In order to prevent problems, this
 * format is only parsed if the server is in fact recognized as an MVS server.
 *
 * Please see tests/dirparsertest.cpp for a list of supported formats and the
 * expected parser result.
 *
 * If adding data to the parser, it first decomposes the raw data into lines,
 * which then are processed further. Each line gets consecutively tested for
 * different formats, starting with the most common Unix style format.
 * Lines not containing a recognized format (e.g. a part of a multiline
 * entry) are remembered and if the next line cannot be parsed either, they
 * get concatenated to be parsed again (and discarded if not recognized).
 */

class CLine;
class CToken;
class CControlSocket;
class CDirectoryListingParser
{
public:
    CDirectoryListingParser(CControlSocket* pControlSocket, const CServer& server);
    ~CDirectoryListingParser();

    CDirectoryListing Parse(const CServerPath &path);

    void AddData(char *pData, int len);
    void AddLine(const wxChar* pLine);

    void Reset();

    void SetTimezoneOffset(const wxTimeSpan& span) { m_timezoneOffset = span; }

    void SetServer(const CServer& server) { m_server = server; }

protected:
    CLine *GetLine(bool breakAtEnd = false);

    void ParseData(bool partial);

    bool ParseLine(CLine *pLine, const enum ServerType serverType, bool concatenated);

    bool ParseAsUnix(CLine *pLine, CDirentry &entry, bool expect_date);
    bool ParseAsDos(CLine *pLine, CDirentry &entry);
    bool ParseAsEplf(CLine *pLine, CDirentry &entry);
    bool ParseAsVms(CLine *pLine, CDirentry &entry);
    bool ParseAsIbm(CLine *pLine, CDirentry &entry);
    bool ParseOther(CLine *pLine, CDirentry &entry);
    bool ParseAsWfFtp(CLine *pLine, CDirentry &entry);
    bool ParseAsIBM_MVS(CLine *pLine, CDirentry &entry);
    bool ParseAsIBM_MVS_PDS(CLine *pLine, CDirentry &entry);
    bool ParseAsIBM_MVS_PDS2(CLine *pLine, CDirentry &entry);
    bool ParseAsIBM_MVS_Migrated(CLine *pLine, CDirentry &entry);
    bool ParseAsMlsd(CLine *pLine, CDirentry &entry);
    bool ParseAsOS9(CLine *pLine, CDirentry &entry);

    // Only call this if servertype set to ZVM since it conflicts
    // with other formats.
    bool ParseAsZVM(CLine *pLine, CDirentry &entry);

    // Only call this if servertype set to HPNONSTOP since it conflicts
    // with other formats.
    bool ParseAsHPNonstop(CLine *pLine, CDirentry &entry);

    // Date / time parsers
    bool ParseUnixDateTime(CLine *pLine, int &index, CDirentry &entry);
    bool ParseShortDate(CToken &token, CDirentry &entry, bool saneFieldOrder = false);
    bool ParseTime(CToken &token, CDirentry &entry);

    // Parse file sizes given like this: 123.4M
    bool ParseComplexFileSize(CToken& token, wxLongLong& size, int blocksize = -1);

    bool GetMonthFromName(const wxString& name, int &month);

    CControlSocket* m_pControlSocket;

    static std::map<wxString, int> m_MonthNamesMap;

    struct t_list
    {
        char *p;
        int len;
    };
    int m_currentOffset;

    std::list<t_list> m_DataList;
    std::list<CDirentry> m_entryList;

    CLine *m_prevLine;

    CServer m_server;

    bool m_fileListOnly;
    std::list<wxString> m_fileList;

    bool m_maybeMultilineVms;

    wxTimeSpan m_timezoneOffset;
};

#endif
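
To give a feel for what handling the most common format involves, here is a minimal sketch of reading one Unix "ls -l" style line. It is not FileZilla's ParseAsUnix. The `UnixEntry` struct and the `parseUnixLine` function are hypothetical names made up for illustration, and the sketch assumes the classic nine-field layout (permissions, link count, owner, group, size, month, day, time or year, name). It does not validate the date fields.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical result type for this sketch (not FileZilla's CDirentry).
struct UnixEntry {
    std::string permissions, owner, group, month, timeOrYear, name, linkTarget;
    int links = 0;
    int day = 0;
    long long size = 0;
    bool isDir = false;
    bool isLink = false;
};

// Parses one "ls -l" style line. Returns false if the line does not match
// the assumed layout; `out` is only written on success.
bool parseUnixLine(std::string line, UnixEntry& out) {
    // FTP listings normally end lines with CRLF; drop a trailing '\r'.
    if (!line.empty() && line.back() == '\r') line.pop_back();

    std::istringstream in(line);
    UnixEntry e;
    if (!(in >> e.permissions >> e.links >> e.owner >> e.group >> e.size
             >> e.month >> e.day >> e.timeOrYear)) {
        return false;  // too few fields, or a non-numeric count/size/day
    }
    if (e.permissions.size() < 10) return false;

    // Only plain files, directories and symlinks are handled in this sketch.
    const char type = e.permissions[0];
    if (type != '-' && type != 'd' && type != 'l') return false;
    e.isDir = (type == 'd');
    e.isLink = (type == 'l');

    // Everything after the date is the name, which may contain spaces.
    // Limitation: names that begin with a space lose their leading spaces.
    std::string rest;
    std::getline(in, rest);
    const std::size_t start = rest.find_first_not_of(' ');
    if (start == std::string::npos) return false;
    rest = rest.substr(start);

    // Symlinks are listed as "name -> target".
    if (e.isLink) {
        const std::size_t arrow = rest.find(" -> ");
        if (arrow != std::string::npos) {
            e.linkTarget = rest.substr(arrow + 4);
            rest = rest.substr(0, arrow);
        }
    }
    if (rest.empty()) return false;
    e.name = rest;
    assert(!e.name.empty());  // invariant: a parsed entry always has a name
    out = e;
    return true;
}
```

A line such as `total 12` is rejected because it has too few fields, while `lrwxrwxrwx 1 root root 7 Feb 1 09:15 latest -> v2.1` yields the name `latest` and the link target `v2.1`. Real servers deviate from this layout in many ways (a missing group column, numeric dates, localized month names), which helps explain why the real parser is so much longer.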

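The last paragraph of the header comment describes the overall flow: split the raw data into lines, test each line against the known formats in turn, and when a line is not recognized, remember it and retry it joined with the next unrecognized line. The sketch below is one reading of that description, not the actual ParseData code. `Entry`, `LineParser`, `parseListing` and the two toy formats (`<size> <name>` and `<name>;<size>`) are invented purely for the demonstration.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

struct Entry {
    std::string name;
    long long size = 0;
};

typedef std::function<bool(const std::string&, Entry&)> LineParser;

// Toy format A: "<size> <name>" with exactly two fields.
bool parseSizeName(const std::string& line, Entry& out) {
    std::istringstream in(line);
    long long size = 0;
    std::string name, extra;
    if (!(in >> size >> name) || (in >> extra)) return false;
    out.name = name;
    out.size = size;
    return true;
}

// Toy format B: "<name>;<size>".
bool parseNameSemicolonSize(const std::string& line, Entry& out) {
    const std::size_t pos = line.find(';');
    if (pos == std::string::npos || pos == 0) return false;
    std::istringstream in(line.substr(pos + 1));
    long long size = 0;
    std::string extra;
    if (!(in >> size) || (in >> extra)) return false;
    out.name = line.substr(0, pos);
    out.size = size;
    return true;
}

std::vector<Entry> parseListing(const std::string& raw,
                                const std::vector<LineParser>& parsers) {
    assert(!parsers.empty());
    std::vector<Entry> entries;
    std::string pending;      // most recent line that no parser recognized
    bool hasPending = false;

    // Try every format in order; the first one that matches wins.
    auto tryParse = [&parsers](const std::string& text, Entry& e) -> bool {
        for (const LineParser& p : parsers) {
            if (p(text, e)) return true;
        }
        return false;
    };

    std::istringstream in(raw);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;

        Entry e;
        if (tryParse(line, e)) {
            entries.push_back(e);
            hasPending = false;   // an older unparsed line is discarded
            continue;
        }
        // Maybe this line is the second half of a two-line entry.
        if (hasPending && tryParse(pending + " " + line, e)) {
            entries.push_back(e);
            hasPending = false;
            continue;
        }
        pending = line;           // remember it for the next round
        hasPending = true;
    }
    return entries;               // a leftover pending line is dropped
}
```

With the two toy parsers, the input lines `30` and `c.txt` each fail on their own but parse as `30 c.txt` once joined, whereas a line like `garbage line here` is remembered once and then silently dropped.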

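The claim in the header comment that 04-05-06 can be read in six valid ways is easy to check. The following sketch tries every assignment of the three numbers to year, month and day and counts the ones that form a real calendar date. `countValidReadings` is a made-up helper, and treating a two-digit year yy as 2000 + yy is an assumption of this example only.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Number of days in the given month of the Gregorian calendar.
int daysInMonth(int year, int month) {
    assert(month >= 1 && month <= 12);
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    const bool leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    return (month == 2 && leap) ? 29 : days[month - 1];
}

bool isValidDate(int year, int month, int day) {
    if (month < 1 || month > 12 || day < 1) return false;
    return day <= daysInMonth(year, month);
}

// Counts how many of the 6 possible field orders turn the three numbers
// into a valid date. Assumption: a two-digit year yy means 2000 + yy.
int countValidReadings(int a, int b, int c) {
    const int field[3] = {a, b, c};
    std::array<int, 3> order = {{0, 1, 2}};  // positions of year, month, day
    int count = 0;
    do {
        const int year = 2000 + field[order[0]];
        const int month = field[order[1]];
        const int day = field[order[2]];
        if (isValidDate(year, month, day)) ++count;
    } while (std::next_permutation(order.begin(), order.end()));
    return count;
}
```

For the fields 4, 5 and 6 every order is valid, so nothing in the string itself tells the parser which one the server meant. For 31, 12 and 99 only the day-month-year reading works, so such a date is unambiguous.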

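The header also declares ParseComplexFileSize for sizes written like 123.4M. As a rough idea of what such a conversion involves, here is a small sketch. It is not FileZilla's implementation and ignores the blocksize parameter. The name `parseHumanSize`, the accepted suffixes (B, K, M, G, T), the use of powers of 1024 and the digit limits are all assumptions made for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <limits>
#include <string>

// Reads at most maxDigits decimal digits starting at token[i] and advances i.
// Returns the number of digits read, or -1 if there were more than maxDigits.
int readDigits(const std::string& token, std::size_t& i, int maxDigits,
               std::int64_t& value) {
    int n = 0;
    value = 0;
    while (i < token.size() &&
           std::isdigit(static_cast<unsigned char>(token[i]))) {
        if (++n > maxDigits) return -1;
        value = value * 10 + (token[i] - '0');
        ++i;
    }
    return n;
}

// Converts tokens such as "512", "2K" or "123.4M" to a byte count.
// `size` is only written on success. Fractions are truncated to whole bytes.
bool parseHumanSize(const std::string& token, std::int64_t& size) {
    std::size_t i = 0;
    std::int64_t whole = 0, fracNum = 0, fracDen = 1;

    if (readDigits(token, i, 15, whole) <= 0) return false;
    if (i < token.size() && token[i] == '.') {
        ++i;
        const int n = readDigits(token, i, 6, fracNum);
        if (n <= 0) return false;            // "1.M" is rejected
        for (int k = 0; k < n; ++k) fracDen *= 10;
    }

    std::int64_t mult = 1;
    if (i < token.size()) {
        switch (std::toupper(static_cast<unsigned char>(token[i]))) {
            case 'B': break;
            case 'K': mult = INT64_C(1) << 10; break;
            case 'M': mult = INT64_C(1) << 20; break;
            case 'G': mult = INT64_C(1) << 30; break;
            case 'T': mult = INT64_C(1) << 40; break;
            default: return false;
        }
        ++i;
    }
    if (i != token.size()) return false;     // trailing characters, e.g. "12MB"
    assert(mult >= 1 && fracDen >= 1);

    // Reject values whose byte count would not fit into 64 bits.
    if (whole >= std::numeric_limits<std::int64_t>::max() / mult) return false;

    // Integer arithmetic only: fracNum < 10^6 and mult <= 2^40, so the
    // product below cannot overflow.
    size = whole * mult + fracNum * mult / fracDen;
    return true;
}
```

Using integers throughout avoids rounding surprises from floating point. With these assumptions, `2K` becomes 2048 and `1.5K` becomes 1536, while tokens such as `abc` or `12MB` are rejected.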

This article is reposted from h2appy's 51CTO blog. Original link: http://blog.51cto.com/h2appy/122279. To republish it, please contact the original author.