pidibble.pdbparse module
- class pidibble.pdbparse.PDBParser(input_format: str = 'PDB', overwrite: bool = False, source_db: str = None, source_id: str = None, filepath: str | Path = None, mappers: dict[str, Callable] = None, comment_chars: list[str] = ['#'], pdb_format_file: str = 'pdb_format.yaml', mmcif_format_file: str = 'mmcif_format.yaml', **kwargs)
Bases: object
A class for parsing PDB and mmCIF files and extracting structured data. This class handles fetching the files, reading them, and parsing their contents into structured records based on predefined formats.
- parsed
A dictionary containing parsed records, where keys are record types and values are pdbrecord.PDBRecord instances or lists of instances. This dictionary is populated after parsing the PDB or mmCIF file.
- Type:
dict
- mappers
A dictionary of mappers for parsing different data types, including custom formats and delimiters.
- Type:
dict
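As a rough illustration of what a dict[str, Callable] mapper table can look like, the sketch below converts raw text fields with callables looked up by type name. The key names ("Integer", "Float", "String", "CommaList"), the convert_field helper, and the sample fields are assumptions made for this example; they are not taken from pidibble.

```python
from typing import Callable

# Hypothetical mapper table: type name -> callable that converts a raw string.
mappers: dict[str, Callable] = {
    "Integer": int,
    "Float": float,
    "String": str.strip,
}

def convert_field(raw: str, type_name: str, table: dict[str, Callable]):
    """Convert one raw text field using the callable registered for type_name."""
    return table[type_name](raw)

# A custom mapper (here, a comma-delimited list) can sit next to the basic ones.
mappers["CommaList"] = lambda raw: [s.strip() for s in raw.split(",") if s.strip()]

serial = convert_field("   42", "Integer", mappers)
occupancy = convert_field("  1.00", "Float", mappers)
chains = convert_field("A, B, C", "CommaList", mappers)
```

Keeping the conversions in a dictionary means a caller can add or override a converter without touching the parsing loop.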
- cif_data
A dictionary containing the parsed mmCIF data. Empty if no input file is provided.
- Type:
dict
- fetch()
Fetch the PDB file based on the provided PDB code or AlphaFold ID. This method checks whether a PDB code or AlphaFold ID was provided, constructs the appropriate file path, and attempts to download the file from the PDB or AlphaFold API.
- Returns:
True if the file was successfully fetched, False otherwise.
- Return type:
bool
- parse()
Parse the PDB or mmCIF file and generate a dictionary of pdbrecord.PDBRecord instances. This method first fetches the PDB or mmCIF file based on the provided PDB code or AlphaFold ID. It then reads the file and parses its contents into structured records. If the input format is mmCIF, it uses the mmcif_parse.MMCIF_Parser to parse the mmCIF data. If the input format is PDB, it uses the pdbrecord.PDBRecord class to parse the PDB lines.
- Returns:
self – The PDBParser instance, whose parsed attribute contains the parsed records.
- Return type:
PDBParser
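The fetch, read, parse sequence and the "return self" convention described for parse() can be sketched with a small stand-in class. PipelineSketch and everything inside it are hypothetical and do not reproduce pidibble's code; in particular, leaving the result empty after a failed fetch is an assumption of this example.

```python
class PipelineSketch:
    """Hypothetical stand-in for a fetch -> read -> parse chain (not pidibble code)."""

    def __init__(self, text: str):
        self.text = text      # stands in for the file that would be downloaded
        self.lines = []
        self.parsed = {}

    def fetch(self) -> bool:
        # Stand-in for a download: succeed only if there is text to work with.
        return bool(self.text)

    def read(self) -> None:
        self.lines = self.text.splitlines()

    def parse(self):
        # Assumption of this sketch: a failed fetch leaves `parsed` empty.
        if self.fetch():
            self.read()
            for line in self.lines:
                self.parsed.setdefault(line[:6].strip(), []).append(line)
        return self  # returning self allows PipelineSketch(...).parse().parsed

records = PipelineSketch("HEADER    EXAMPLE\nEND").parse().parsed
```

Because parse() returns the object itself, construction, parsing, and access to the result can be written as one chained expression.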
- parse_PDB()
Parse the PDB lines and generate a dictionary of pdbrecord.PDBRecord instances. This method iterates through the PDB lines, identifies the record type based on the first character, and creates a new pdbrecord.PDBRecord instance for each record. It handles different record types, including continuation records and grouped records.
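To make the notion of continuation records concrete, here is a minimal sketch that groups lines by record name and merges TITLE continuations. It relies only on the PDB file format itself, where the record name occupies columns 1-6 and TITLE continuation lines carry a number in columns 9-10 with text from column 11. The group_records function and the sample lines are made up for this example and are not how pidibble implements parse_PDB().

```python
def group_records(lines):
    """Group PDB lines by record name (columns 1-6), joining TITLE continuations."""
    records = {}
    for line in lines:
        line = line.ljust(80)          # pad short lines to the 80-column width
        key = line[:6].strip()
        if key == "TITLE":
            # Columns 11-80 hold the title text on first and continuation lines.
            text = line[10:].strip()
            records[key] = (records[key] + " " + text) if key in records else text
        else:
            # Other records are simply collected in order of appearance.
            records.setdefault(key, []).append(line.rstrip())
    return records

sample = [
    "HEADER    EXAMPLE CLASSIFICATION",
    "TITLE     EXAMPLE STRUCTURE OF A",
    "TITLE    2 HYPOTHETICAL PROTEIN",
    "REMARK   2 RESOLUTION.    2.00 ANGSTROMS.",
    "REMARK   3 REFINEMENT.",
]
grouped = group_records(sample)
```

Here the two TITLE lines collapse into a single string, while the REMARK lines stay as a list of two entries.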
- parse_base()
Parse the base records from the PDB or mmCIF file. This method initializes the parsing process based on the input format. If the input format is mmCIF, it uses the mmcif_parse.MMCIF_Parser to parse the mmCIF data. If the input format is PDB, it uses the pdbrecord.PDBRecord class to parse the PDB lines.
- parse_embedded_records()
Parse embedded records within the parsed records. This method iterates through the parsed records and checks if any record has embedded records. If an embedded record is found, it calls the pdbrecord.PDBRecord.parse_embedded() method to parse the embedded records. It updates the PDBParser.parsed dictionary with the new parsed records.
- parse_mmCIF()
Parse the mmCIF data and generate a dictionary of pdbrecord.PDBRecord instances. This method uses the mmcif_parse.MMCIF_Parser to parse the mmCIF data and store the parsed records in PDBParser.parsed.
- parse_tables()
Parse tables within the parsed records. This method iterates through the parsed records and checks if any record has table formats. If a table format is found, it calls the pdbrecord.PDBRecord.parse_tables() method to parse the tables. It updates the PDBParser.parsed dictionary with the new parsed records.
- parse_tokens()
Parse tokens within the parsed records. This method iterates through the parsed records and checks if any record has token formats. If a token format is found, it calls the pdbrecord.PDBRecord.parse_tokens() method to parse the tokens. It updates the PDBParser.parsed dictionary with the new parsed records.
- post_process()
Post-process the parsed records to handle embedded records, tokens, and tables. This method checks if the input format is mmCIF and processes the records accordingly. If the input format is PDB, it processes the records to handle embedded records, tokens, and tables.
- read()
Read the PDB or mmCIF file based on the input format. This method checks the input format and calls the appropriate read method.
- read_PDB()
Read the PDB file and store its lines in PDBParser.pdb_lines. This method opens the PDB file, reads its contents, and splits it into lines. If the last line is empty, it removes it from the list of lines.
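The read-and-split behavior described for read_PDB() can be approximated as follows. The read_lines helper and the temporary file are assumptions made for this example; the sketch mirrors the description (split on newlines, drop a trailing empty line) and is not pidibble's own code.

```python
import os
import tempfile

def read_lines(path):
    """Read a text file and split it into lines, dropping a trailing empty line."""
    with open(path, "r") as handle:
        lines = handle.read().split("\n")
    # A file ending in a newline yields one empty string at the end of the split.
    if lines and lines[-1] == "":
        lines.pop()
    return lines

# Write a tiny two-line file to demonstrate the function.
with tempfile.NamedTemporaryFile("w", suffix=".pdb", delete=False) as tmp:
    tmp.write("HEADER    EXAMPLE\nEND\n")
    tmp_path = tmp.name

pdb_lines = read_lines(tmp_path)
os.remove(tmp_path)
```

Without the trailing-line check, the final newline in the file would leave an empty string as the last element of the list.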
- read_mmCIF()
Read the mmCIF file and store its data in PDBParser.cif_data. This method uses the mmcif.io.IoAdapterCore.IoAdapterCore to read the mmCIF file and store the data in PDBParser.cif_data.
- pidibble.pdbparse.get_symm_ops(rec: PDBRecord)
Extract the symmetry operations from a PDB record and return them as a transformation matrix and a translation vector.
- Parameters:
rec (pdbrecord.PDBRecord) – The PDBRecord instance containing the symmetry operations.
- Returns:
M (numpy.ndarray) – The 3x3 transformation matrix.
T (numpy.ndarray) – The 3x1 translation vector.
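A matrix and translation pair of this kind is conventionally applied to a coordinate x as x' = Mx + T. The sketch below shows that arithmetic with plain Python lists so that it has no dependencies. The apply_symm_op helper and the sample values are hypothetical, and the sketch does not call get_symm_ops itself.

```python
def apply_symm_op(M, T, point):
    """Return M * point + T for a 3x3 matrix M, a translation T, and a 3-vector point."""
    return [sum(M[i][j] * point[j] for j in range(3)) + T[i] for i in range(3)]

# Example operation: a 180-degree rotation about z, then a shift of 10 along z.
M = [[-1.0, 0.0, 0.0],
     [0.0, -1.0, 0.0],
     [0.0, 0.0, 1.0]]
T = [0.0, 0.0, 10.0]

moved = apply_symm_op(M, T, [1.0, 2.0, 3.0])
```

Applying the same operation to every atom position of a chain would generate its symmetry-related copy.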