Getting All Records in a Graph
Get all records in a graph using the build_graph
function.
- async geneagrapher_core.traverse.build_graph(start_items, *, http_semaphore=None, max_records=None, user_agent=None, cache=None, record_callback=None, report_callback=None)
Build a complete geneagraph using the start_items as the graph’s leaf nodes.
- Parameters:
  - start_items (List[TraverseItem]) – a list of nodes, each paired with the direction(s) in which to traverse from it
  - http_semaphore (Optional[Semaphore]) – a semaphore to limit HTTP request concurrency
  - max_records (Optional[int]) – the maximum number of records to include in the built graph
  - user_agent (Optional[str]) – a custom user agent string to use in HTTP requests
  - cache (Optional[Cache]) – a cache object for getting and storing results
  - record_callback (Optional[Callable[[TaskGroup, Record], Awaitable[None]]]) – callback function called with record data as it is retrieved
  - report_callback (Optional[Callable[[TaskGroup, int, int, int], Awaitable[None]]]) – callback function called to report graph-building progress
- Return type:
Example:
# Build a graph that contains Carl Friedrich Gauß (18231), his advisor tree,
# Johann Friedrich Pfaff (18230), his advisor tree, and his descendant tree.
start_items = [
    TraverseItem(RecordId(18231), TraverseDirection.ADVISORS),
    TraverseItem(
        RecordId(18230), TraverseDirection.ADVISORS | TraverseDirection.DESCENDANTS
    ),
]
graph = await build_graph(start_items)
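The http_semaphore parameter is described as limiting HTTP request concurrency. The sketch below illustrates that general mechanism using only the standard library; it does not show how build_graph uses the semaphore internally. The names fake_request, ConcurrencyProbe, and run_requests are hypothetical, and no network traffic occurs.

```python
import asyncio


class ConcurrencyProbe:
    """Tracks how many simulated requests are in flight at once."""

    def __init__(self) -> None:
        self.active = 0
        self.peak = 0


async def fake_request(
    record_id: int, semaphore: asyncio.Semaphore, probe: ConcurrencyProbe
) -> int:
    # Hypothetical stand-in for an HTTP request; nothing is sent anywhere.
    async with semaphore:  # waits here while `limit` requests are in flight
        probe.active += 1
        probe.peak = max(probe.peak, probe.active)
        await asyncio.sleep(0.01)  # simulated network latency
        probe.active -= 1
    return record_id


async def run_requests(limit: int, count: int) -> tuple[list[int], int]:
    semaphore = asyncio.Semaphore(limit)
    probe = ConcurrencyProbe()
    results = await asyncio.gather(
        *(fake_request(i, semaphore, probe) for i in range(count))
    )
    return list(results), probe.peak


# Ten simulated requests, with at most three allowed to run concurrently.
results, peak = asyncio.run(run_requests(limit=3, count=10))
```

Assuming the Semaphore in the signature above is an asyncio.Semaphore, a similarly constructed object could presumably be passed as http_semaphore to cap the number of simultaneous requests.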
Callbacks
Report callback
The build_graph
function optionally takes a
reporting callback function. If provided, this function will be called
when new records are added to the traversal plan or when records have
been retrieved.
The report_callback
function is called with:
- An asyncio.TaskGroup, which is useful if you want to do something expensive in the reporting callback and do not want to block the graph-building path.
- Three integers that report:
  - The number of known records yet to be retrieved.
  - The number of records in the process of being retrieved.
  - The number of records that have been retrieved.
Examples
Here’s an example of a simple, blocking callback:
async def show_progress(
    tg: asyncio.TaskGroup, to_fetch: int, fetching: int, fetched: int
) -> None:
    print(f"Todo: {to_fetch} Doing: {fetching} Done: {fetched}")
Here’s a more complicated example where you might want to create a new task to complete the reporting. Doing so keeps the reporting callback from blocking progress on data retrieval.
async def do_expensive_network_request(
    to_fetch: int, fetching: int, fetched: int
) -> None:
    # Do something that takes a long time.
    await asyncio.sleep(1)  # placeholder for the slow work

async def show_progress(
    tg: asyncio.TaskGroup, to_fetch: int, fetching: int, fetched: int
) -> None:
    # Schedule the slow work on the task group and return immediately.
    tg.create_task(do_expensive_network_request(to_fetch, fetching, fetched))
Record callback
The build_graph
function optionally takes a
record callback function. If provided, this function will be called
when a record has been retrieved. The callback function receives the
record data as an argument.
The record_callback
function is called with:
- An asyncio.TaskGroup, which is useful if you want to do something expensive in the record callback and do not want to block the graph-building path.
- A Record object containing the record data.
Examples
Here’s an example of a simple, blocking callback:
async def got_record(tg: asyncio.TaskGroup, record: Record) -> None:
    print(record)
Here’s a more complicated example where you might want to create a new task. Doing so keeps the record callback from blocking progress on data retrieval.
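async def do_expensive_network_request(record: Record) -> None:
    # Do something that takes a long time.
    await asyncio.sleep(1)  # placeholder for the slow work

async def got_record(tg: asyncio.TaskGroup, record: Record) -> None:
    # Schedule the slow work on the task group and return immediately.
    tg.create_task(do_expensive_network_request(record))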
Example Code
An example of how to use the cache
and report_callback
arguments to build_graph
is in the repository’s
examples directory.