A Study of the Metadata Lineage Feature in Apache Atlas
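I. Generating lineage data
Lineage data is generated through Process entities. A Process can be created automatically when data is imported, or added explicitly through the REST API.
1. Lineage generated automatically by a sqoop import
When sqoop imports MySQL data into hive, the sqoop Atlas Hook generates the lineage data automatically once the import succeeds.
The following sqoop command imports all tables of a MySQL database into the hive warehouse: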
sqoop import-all-tables --connect jdbc:mysql://192.168.1.1:3306/testdb --username root --password ****** --hive-import --hive-database testdb -m 1
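The lineage graph of each table can then be viewed in the Atlas admin console.
2. Lineage generated through the REST API
Adding a Process through the Atlas REST API also generates lineage data.
For example, to link a MySQL table and a hive table that Atlas already manages, first look up the guid of each table, then build the request body and send it with a POST request to: http://{atlas_host}:21000/api/atlas/v2/entity/bulk
Request body: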
{"entities":[{"typeName":"Process","attributes":{"owner":"root","createTime":"2020-05-07T10:32:21.0Z","updateTime":"","qualifiedName":"people@process@mysql://192.168.1.1:3306","name":"peopleProcess","description":"people Process","comment":"test people Process","contact_info":"jdbc","type":"table","inputs":[{"guid": "5a676b74-e058-4e81-bcf8-42d73f4c1729","typeName": "rdbms_table"}],"outputs":[{"guid": "2e7c70e1-5a8a-4430-859f-c46d267e33fd","typeName": "hive_table"}]}}]}
The lineage graph of the table can then be viewed in the Atlas admin console.
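The request in this subsection can also be assembled in code. The following is a minimal Python sketch, not a definitive client: the helper names build_process_entity and post_entities are hypothetical, only a subset of the attributes from the request body is included, and authentication is left to the caller because the article does not describe it. The guids are the ones used in the request body shown earlier.

```python
import json
import urllib.request


def build_process_entity(name, qualified_name, description,
                         input_guid, input_type,
                         output_guid, output_type, owner="root"):
    """Build a request body for POST /api/atlas/v2/entity/bulk (hypothetical helper)."""
    return {
        "entities": [
            {
                "typeName": "Process",
                "attributes": {
                    "owner": owner,
                    "qualifiedName": qualified_name,
                    "name": name,
                    "description": description,
                    # inputs/outputs reference entities that Atlas already manages
                    "inputs": [{"guid": input_guid, "typeName": input_type}],
                    "outputs": [{"guid": output_guid, "typeName": output_type}],
                },
            }
        ]
    }


def post_entities(atlas_base_url, payload, auth_header=None):
    """Send the payload to Atlas and return the decoded JSON reply.

    auth_header is an assumption: pass whatever Authorization value
    your Atlas deployment expects, or None if it needs no header.
    """
    request = urllib.request.Request(
        atlas_base_url.rstrip("/") + "/api/atlas/v2/entity/bulk",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    if auth_header:
        request.add_header("Authorization", auth_header)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Build (but do not send) the body that links the MySQL table to the hive table.
payload = build_process_entity(
    name="peopleProcess",
    qualified_name="people@process@mysql://192.168.1.1:3306",
    description="people Process",
    input_guid="5a676b74-e058-4e81-bcf8-42d73f4c1729",
    input_type="rdbms_table",
    output_guid="2e7c70e1-5a8a-4430-859f-c46d267e33fd",
    output_type="hive_table",
)
print(json.dumps(payload, indent=2))
```

Calling post_entities("http://{atlas_host}:21000", payload) with a real host would submit the Process. Building the body is kept separate from sending it so that it can be inspected or logged first.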
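3. Lineage generated automatically by a hive create-table statement
When hive executes a statement such as create table t2 as select id, name from T1, both the table-level lineage and the column-level lineage are generated automatically.
Hive versions below 2.2.0 have a bug that prevents column-level lineage from being generated automatically. Upgrade hive to 2.2.0 or later to get column-level lineage.
The table lineage graph can be viewed in the Atlas admin console.
Column-level lineage graph:
4. Lineage graph of multiple chained Processes
II. Managing lineage data
1. Querying lineage data through the REST API
GET request: http://{atlas_host}:21000/api/atlas/v2/lineage/01d12e5f-1ef5-46a8-ac13-29be71e8f78e
Response: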
{"baseEntityGuid":"01d12e5f-1ef5-46a8-ac13-29be71e8f78e","lineageDirection":"BOTH","lineageDepth":3,"guidEntityMap":{"5a676b74-e058-4e81-bcf8-42d73f4c1729":{"typeName":"rdbms_table","attributes":{"owner":"root","createTime":1577687198000,"qualifiedName":"testdb.p_people@mysql://192.168.1.1:3306","name":"p_people","description":"MySQL数据库表:testdb.p_people"},"guid":"5a676b74-e058-4e81-bcf8-42d73f4c1729","status":"ACTIVE","displayText":"p_people","classificationNames":[],"meaningNames":[],"meanings":[]},"2e7c70e1-5a8a-4430-859f-c46d267e33fd":{"typeName":"hive_table","attributes":{"owner":"hdfs","createTime":1578981817000,"qualifiedName":"testdb.p_people@primary","name":"p_people"},"guid":"2e7c70e1-5a8a-4430-859f-c46d267e33fd","status":"ACTIVE","displayText":"p_people","classificationNames":["people"],"meaningNames":[],"meanings":[]},"2b65eb7f-596e-48f0-a94d-240e56a4da93":{"typeName":"Process","attributes":{"owner":"root","qualifiedName":"people@process@mysql://192.168.1.1:3306","name":"peopleProcess","description":"people Process"},"guid":"2b65eb7f-596e-48f0-a94d-240e56a4da93","status":"ACTIVE","displayText":"peopleProcess","classificationNames":[],"meaningNames":[],"meanings":[]},"01d12e5f-1ef5-46a8-ac13-29be71e8f78e":{"typeName":"hive_process","attributes":{"qualifiedName":"testdb.p_people_tmp2@primary:1588921268000","name":"create table p_people_tmp2 as select peopleid,peopletype,credentialtype,credentialno,peoplename,gender,nation from p_people"},"guid":"01d12e5f-1ef5-46a8-ac13-29be71e8f78e","status":"ACTIVE","displayText":"create table p_people_tmp2 as select peopleid,peopletype,credentialtype,credentialno,peoplename,gender,nation from p_people","classificationNames":["people"],"meaningNames":[],"meanings":[]},"a4ccceb2-a52c-46a2-b4fd-27d26b8aad3f":{"typeName":"hive_table","attributes":{"owner":"hive","createTime":1588921268000,"qualifiedName":"testdb.p_people_tmp2@primary","name":"p_people_tmp2"},"guid":"a4ccceb2-a52c-46a2-b4fd-27d26b8aad3f","status":"ACTIVE","displayText":"p_people_tmp2","classificationNames":["people"],"meaningNames":[],"meanings":[]}},"relations":[{"fromEntityId":"01d12e5f-1ef5-46a8-ac13-29be71e8f78e","toEntityId":"a4ccceb2-a52c-46a2-b4fd-27d26b8aad3f","relationshipId":"148cc83d-5b67-4174-91e4-767509483e13"},{"fromEntityId":"2e7c70e1-5a8a-4430-859f-c46d267e33fd","toEntityId":"01d12e5f-1ef5-46a8-ac13-29be71e8f78e","relationshipId":"eb768346-d32a-40f9-bf04-d23abbcc3221"},{"fromEntityId":"2b65eb7f-596e-48f0-a94d-240e56a4da93","toEntityId":"2e7c70e1-5a8a-4430-859f-c46d267e33fd","relationshipId":"bea47efd-2645-4d8a-ba6b-8f4ef9bb7316"},{"fromEntityId":"5a676b74-e058-4e81-bcf8-42d73f4c1729","toEntityId":"2b65eb7f-596e-48f0-a94d-240e56a4da93","relationshipId":"517db5b7-f537-4e66-97f1-33c2863fb440"}]}
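The relations array in the response is an edge list (fromEntityId to toEntityId), so the upstream chain of any entity can be recovered by walking the edges backwards. The following Python sketch illustrates this under stated assumptions: lineage_url and upstream_guids are hypothetical helper names, the direction and depth query parameters mirror the lineageDirection and lineageDepth fields of the response, and the sample data is a trimmed copy of the response above that keeps only typeName and the relations.

```python
import urllib.parse
from collections import defaultdict, deque


def lineage_url(atlas_base_url, guid, direction="BOTH", depth=3):
    """Build the lineage query URL; direction may be INPUT, OUTPUT or BOTH."""
    query = urllib.parse.urlencode({"direction": direction, "depth": depth})
    return f"{atlas_base_url.rstrip('/')}/api/atlas/v2/lineage/{guid}?{query}"


def upstream_guids(lineage, start_guid):
    """Breadth-first walk against the edge direction, nearest ancestor first."""
    parents = defaultdict(list)
    for relation in lineage.get("relations", []):
        parents[relation["toEntityId"]].append(relation["fromEntityId"])

    seen = {start_guid}
    order = []
    queue = deque([start_guid])
    while queue:
        current = queue.popleft()
        for parent in parents[current]:
            if parent not in seen:  # guards against cycles and duplicates
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


# Trimmed copy of the response above: entity types plus the four relations.
sample = {
    "guidEntityMap": {
        "5a676b74-e058-4e81-bcf8-42d73f4c1729": {"typeName": "rdbms_table"},
        "2e7c70e1-5a8a-4430-859f-c46d267e33fd": {"typeName": "hive_table"},
        "2b65eb7f-596e-48f0-a94d-240e56a4da93": {"typeName": "Process"},
        "01d12e5f-1ef5-46a8-ac13-29be71e8f78e": {"typeName": "hive_process"},
        "a4ccceb2-a52c-46a2-b4fd-27d26b8aad3f": {"typeName": "hive_table"},
    },
    "relations": [
        {"fromEntityId": "01d12e5f-1ef5-46a8-ac13-29be71e8f78e",
         "toEntityId": "a4ccceb2-a52c-46a2-b4fd-27d26b8aad3f"},
        {"fromEntityId": "2e7c70e1-5a8a-4430-859f-c46d267e33fd",
         "toEntityId": "01d12e5f-1ef5-46a8-ac13-29be71e8f78e"},
        {"fromEntityId": "2b65eb7f-596e-48f0-a94d-240e56a4da93",
         "toEntityId": "2e7c70e1-5a8a-4430-859f-c46d267e33fd"},
        {"fromEntityId": "5a676b74-e058-4e81-bcf8-42d73f4c1729",
         "toEntityId": "2b65eb7f-596e-48f0-a94d-240e56a4da93"},
    ],
}

# Upstream chain of the hive table p_people_tmp2, as entity type names.
chain = upstream_guids(sample, "a4ccceb2-a52c-46a2-b4fd-27d26b8aad3f")
chain_types = [sample["guidEntityMap"][guid]["typeName"] for guid in chain]
print(chain_types)
```

For this sample the walk goes from p_people_tmp2 back through the hive_process, the hive table p_people, the manually added Process, and finally the MySQL table, which matches the chained lineage described in this article.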
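2. Viewing the lineage graph in the admin UI
The lineage graph is shown on the Lineage tab of each entity's detail page in the Atlas admin console.
The page provides several buttons, in order: re-layout the lineage graph, export as a PNG image, choose whether hovering shows the current path or the node details, hide/filter, node search, zoom in, zoom out, and full screen.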