Delete Collections

Overview

Delete one or more collections.

Pass in either the actual or friendly collection name.
Collections that have children can't be passed in.
Use delete_collections_recursively instead.

Parameters:

  • collection_names (Union[str, list], required): Collections to be deleted.
  • safe_delete (Optional[str], default None): The client name to be used when printing the safe delete commands.
  • delete_assets (bool, default False): If True, deletes all of the assets from the collection before deleting the collection.
  • delete_assets_timeout (int, default 30): Timeout, in minutes, for deleting the assets when delete_assets is True.
  • force_actual_name (bool, default False): Edge case. Set to True when there are duplicate friendly names and the name passed in is also the actual name of one of those collections.
  • api_version (Optional[str], default None): If None, the default is "2019-11-01-preview".

Returns:

None. Outputs to the screen that the collection has been deleted.

Source code in purviewautomation/collections.py
def delete_collections(
    self,
    collection_names: Union[str, list],
    safe_delete: Optional[str] = None,
    delete_assets: bool = False,
    delete_assets_timeout: int = 30,
    force_actual_name: bool = False,
    api_version: Optional[str] = None,
) -> None:
    """Delete one or more collections.

        Pass in either the actual or friendly collection name.
        Collections that have children can't be passed in.
        Use delete_collections_recursively instead.

    Args:
        collection_names: Collections to be deleted.
        safe_delete: The client name to be used when printing the safe delete commands.
        delete_assets: if True, will delete all of the assets from the collection.
        delete_assets_timeout: If delete_assets is True, this is the timeout for deleting the assets.
            If None, the default is 30 minutes.
        force_actual_name: Edge Case. If multiple duplicate friendly names
            and one of the actual names is the name passed in.
        api_version: If None, default is "2019-11-01-preview".

    Returns:
        Outputs to the screen that the collection has been deleted.
    """
    if not api_version:
        api_version = self.collections_api_version

    if not isinstance(collection_names, (str, list)):
        raise ValueError("The collection_names parameter has to either be a string or a list.")
    elif isinstance(collection_names, str):
        collection_names = [collection_names]

    if safe_delete:
        self._safe_delete(collection_names=collection_names, safe_delete_name=safe_delete)

    for name in collection_names:
        coll_name = self.get_real_collection_name(collection_name=name, force_actual_name=force_actual_name)
        child_collections_check = self.get_child_collection_names(coll_name)
        if child_collections_check["count"] > 0:
            err_msg = (
                f"The collection '{name}' has child collections. Can only delete collections that have no children. "
                "To delete collections and all of their children recursively, "
                f"use: delete_collections_recursively('{name}')"
            )
            raise ValueError(err_msg)

        url = f"{self.collections_endpoint}/{coll_name}?api-version={api_version}"
        try:
            colls = self.list_collections(only_names=True)
            friendly_name = colls[coll_name]["friendlyName"]
            if delete_assets:
                self.delete_collection_assets(collection_names=coll_name, timeout=delete_assets_timeout)
            delete_collections_request = requests.delete(url=url, headers=self.header)
            if not delete_collections_request.content:
                print(f"The collection '{friendly_name}' was successfully deleted")
                print("\n")
            else:
                print(delete_collections_request.content)
        except Exception as e:
            raise e

Important

  • This method only deletes collections that have no children. To delete collections with children (recursive delete), see Delete Collections Recursively.
  • Collection names are case sensitive. My-Company is different from my-Company.
  • To delete collections that also have assets, add the delete_assets parameter. See: Delete Assets Section.
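
One way to avoid the error raised for collections with children is to check for children before calling delete_collections. The sketch below assumes a client that exposes get_real_collection_name and get_child_collection_names and that they behave as they are used in the source listing (the latter returning a dict with a "count" key). The has_children helper and the FakeClient stand-in are hypothetical and exist only so the example runs without a Purview account.

```python
def has_children(client, name):
    """Return True if the named collection has child collections.

    Assumes the client methods behave as they are used inside
    delete_collections; this is an illustration, not library code.
    """
    real_name = client.get_real_collection_name(
        collection_name=name, force_actual_name=False
    )
    return client.get_child_collection_names(real_name)["count"] > 0


class FakeClient:
    """Hypothetical in-memory stand-in for a Purview client."""

    def __init__(self, children):
        # Maps a collection's real name to a list of its children.
        self._children = children

    def get_real_collection_name(self, collection_name, force_actual_name=False):
        # Pretend friendly and real names are identical in this toy model.
        return collection_name

    def get_child_collection_names(self, collection_name):
        kids = self._children.get(collection_name, [])
        return {"count": len(kids), "name": kids}


client = FakeClient({"My-Company": ["Sub Collection 1"], "Sub Collection 1": []})
for name in ["My-Company", "Sub Collection 1"]:
    if has_children(client, name):
        print(f"{name}: has children, use the recursive delete instead")
    else:
        print(f"{name}: no children, can be passed to delete_collections")
```

With a real client, the same has_children call could be used to decide between delete_collections and the recursive variant before anything is deleted.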

Examples

Delete One Collection

If the Purview collections look like this:

Collections

Delete Sub Collection 1:

client.delete_collections(collection_names="Sub Collection 1")

The output will be printed to the screen:

Collections

Delete Multiple Collections

Deleting multiple collections under different hierarchies is also allowed. If the Purview collections look like this:

Collections

Delete Sub Collection 2 and Sub Collection 1 by passing in a list:

client.delete_collections(["Sub Collection 2", "Sub Collection 1"])
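
Per the source listing, a single string and a one-element list are handled the same way: the string is wrapped in a list, and any other type raises a ValueError. Names are then processed in list order, so if a later name fails the child check, the earlier collections have already been deleted. The standalone function below is a sketch that mirrors only that input handling; normalize_collection_names is a hypothetical name, not part of the library.

```python
from typing import List, Union


def normalize_collection_names(collection_names: Union[str, list]) -> List[str]:
    """Mirror the input handling shown in the delete_collections source."""
    if not isinstance(collection_names, (str, list)):
        raise ValueError(
            "The collection_names parameter has to either be a string or a list."
        )
    if isinstance(collection_names, str):
        # A single name becomes a one-element list.
        return [collection_names]
    return collection_names


print(normalize_collection_names("Sub Collection 1"))
print(normalize_collection_names(["Sub Collection 2", "Sub Collection 1"]))
```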

Rollback/Safe Delete

When deleting collections, passing in the safe_delete parameter prints the commands needed to recreate the deleted collection or collections. Think of this as a rollback option.

If Purview looked like this:

Collections

Running the code:

client.delete_collections(collection_names="Collection To Delete", 
                          safe_delete="client")

This deletes the collection in Purview and prints to the screen the exact script needed to recreate it. The same actual and friendly names are used:

Collections

Simply copy and run the code to recreate the collection:

client.create_collections(start_collection='tkhegu', 
                          collection_names='msvebq', 
                          safe_delete_friendly_name='Collection To Delete')

The Collection To Delete collection is recreated:

Collections
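
Because the rollback commands are only printed to the screen, it can be useful to capture them so they are not lost when the terminal closes. The sketch below uses contextlib.redirect_stdout from the standard library and assumes the commands are written with print to standard output. The run_and_capture helper and the fake_delete stand-in are hypothetical, and the text fake_delete prints is a placeholder rather than the library's real output.

```python
import contextlib
import io


def run_and_capture(func, *args, **kwargs):
    """Call func and return everything it printed to stdout as one string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def fake_delete(collection_names, safe_delete=None):
    """Hypothetical stand-in that only prints placeholder text."""
    if safe_delete:
        print("<rollback commands would be printed here>")
    print(f"The collection '{collection_names}' was successfully deleted")


captured = run_and_capture(
    fake_delete, collection_names="Collection To Delete", safe_delete="client"
)
print(captured)
```

With a real client, client.delete_collections could be passed in place of fake_delete, and the captured string could then be written to a file for later use.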

Delete Assets

To delete the assets in a collection and then delete the collection itself, use the delete_assets parameter, optionally with delete_assets_timeout.

Important

The Service Principal or user that authenticated/connected to Purview must be listed as a Data Curator on the collection in order to delete assets in that collection. For more info, see: Purview Roles

Deleting assets in a collection is irreversible. Re-scan the deleted assets to add them back to the collection.

The code will delete all the assets and the collection. To only delete assets in a collection and not delete the collection, see: Delete Collection Assets

The root collection (top level collection) can't be deleted. In the example above, purview-test-2 is the root collection. To only delete the assets, see: Delete Collection Assets

For example, the collection below, Collection To Delete, has 3 assets:

Collections

Run the code to delete all 3 assets in the collection and delete the collection as well:

client.delete_collections(collection_names="Collection To Delete",
                          delete_assets=True)

Asset deletion has a default timeout of 30 minutes. If the collection has a large number of assets, pass an integer to the delete_assets_timeout parameter to specify a longer or shorter timeout (in minutes).

For example, the code below runs for up to an hour before timing out. It also stops as soon as all the assets are deleted, so if deletion takes only one minute, the code stops after one minute:

client.delete_collections(collection_names="Collection To Delete",
                          delete_assets=True,
                          delete_assets_timeout=60)
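
The timeout behavior described above (stop as soon as the work is done, otherwise give up after a fixed number of minutes) is a common polling pattern. The sketch below illustrates that pattern in general terms. It is not the library's implementation, and wait_until_empty, count_assets, and poll_seconds are hypothetical names chosen for this example.

```python
import time


def wait_until_empty(count_assets, timeout_minutes=30, poll_seconds=5,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll count_assets() until it returns 0 or the timeout elapses.

    Returns True if the count reached 0, False if the timeout was hit.
    The clock and sleep arguments are injectable to make testing easy.
    """
    deadline = clock() + timeout_minutes * 60
    while True:
        if count_assets() == 0:
            return True   # finished early: nothing left to delete
        if clock() >= deadline:
            return False  # timed out with assets still remaining
        sleep(poll_seconds)


# Simulated asset counts: 3 remaining, then 1, then 0.
counts = iter([3, 1, 0])
finished = wait_until_empty(lambda: next(counts), timeout_minutes=60, poll_seconds=0)
print(finished)
```

The loop checks the remaining count before checking the deadline, so work that completes on the final poll is still reported as finished.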

Handling Duplicate Friendly Names

If there are multiple duplicate friendly names or other edge cases, see: Handling Multiple Duplicate Friendly Names.

In Purview, the real name (the under-the-hood name) of a collection has to be unique, but there can be duplicate friendly names under different hierarchies:

Collections

In the above example, the friendly name Sub Finance Team appears under two different hierarchies. Under the hood, the two names will be different (different real names).

When trying to delete a collection (or collections) whose friendly name is duplicated, a friendly error is raised that shows the info for each matching collection so you can choose which real name to use:

client.delete_collections("Sub Finance Team")

This raises a friendly error:

Collections

From the options above, you can see the collection info and choose the real name of one of them (or both): either zgnm71 or bn2azu. Below, the Sub Finance Team collection under Finance is deleted:

client.delete_collections(collection_names="zgnm71")

Collections
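
To find the real names behind a duplicated friendly name without triggering the error first, the collection listing can be filtered directly. The sketch below assumes a dict keyed by real name whose values contain a "friendlyName" entry, which is how the source listing reads the result of list_collections(only_names=True). The real_names_for helper and the "abc123" entry are hypothetical.

```python
def real_names_for(collections, friendly_name):
    """Return the sorted real names whose friendly name matches exactly."""
    return sorted(
        real_name
        for real_name, info in collections.items()
        if info["friendlyName"] == friendly_name
    )


# Sample data shaped like the assumed list_collections(only_names=True) result.
sample = {
    "zgnm71": {"friendlyName": "Sub Finance Team"},
    "bn2azu": {"friendlyName": "Sub Finance Team"},
    "abc123": {"friendlyName": "Finance"},  # hypothetical real name
}

print(real_names_for(sample, "Sub Finance Team"))  # ['bn2azu', 'zgnm71']
```

The comparison is an exact string match, in keeping with collection names being case sensitive.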

Edge Case Using the force_actual_name parameter

This is used when there are duplicate friendly names across different hierarchies and the real name of one of them is the name you're using. For example, in the Purview account below there are two collections with the friendly name test1:

Collections

Running the command below:

client.delete_collections("test1")

This could output the following (in this example, the collections were created specifically to raise this error):

Collections

In the above image, test1 is listed as a real name. When this rare edge case occurs, set force_actual_name to True to delete the real test1 collection (under My-Company):

client.delete_collections("test1", force_actual_name=True)

Collections
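
The edge case can be pictured as a name-resolution step that normally treats the input as a friendly name but, when force_actual_name is True, treats it as a real name. The sketch below is an assumed model of that behavior for illustration only, not the library's get_real_collection_name. The resolve_real_name helper and the "qwe456" real name are hypothetical.

```python
def resolve_real_name(collections, name, force_actual_name=False):
    """Resolve a friendly or real name to one real name (assumed logic)."""
    if force_actual_name:
        # Treat the input strictly as a real name.
        if name not in collections:
            raise ValueError(f"No collection has the real name '{name}'.")
        return name

    matches = sorted(
        real for real, info in collections.items() if info["friendlyName"] == name
    )
    if len(matches) > 1:
        raise ValueError(
            f"Multiple collections share the friendly name '{name}': {matches}"
        )
    if len(matches) == 1:
        return matches[0]
    if name in collections:
        return name
    raise ValueError(f"No collection named '{name}' was found.")


sample = {
    "test1": {"friendlyName": "test1"},   # real name equals the friendly name
    "qwe456": {"friendlyName": "test1"},  # hypothetical real name
}

try:
    resolve_real_name(sample, "test1")
except ValueError as err:
    print(err)

print(resolve_real_name(sample, "test1", force_actual_name=True))  # test1
```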