First, make sure the environment is clean: if Atlas was installed before, remove any leftovers.
Make sure the server Atlas will be installed on has enough memory (at least 16 GB) and the necessary Hadoop roles:
HDFS client — retrieves and updates account membership in the user-group information (UGI) used by Hadoop; useful for debugging. HBase client — Atlas stores its Janus database in HBase and uses it for the initial import of HBase content, so it needs permanent access to two tables in the HBase service. Hive client — used for the initial import of Hive content.
Prepare the build environment
Maven 3.8.8 — must be 3.8 or newer; 3.6 cannot compile the project
Java 1.8.0_181 — keep it consistent with your CDH environment
Node.js v16.20.2
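A quick sanity check of the toolchain before building (a minimal sketch; output formats differ by distribution):
# Check the build toolchain versions (expected: Maven 3.8.x, JDK 1.8.0_181, Node v16.20.2)
mvn -v
java -version
node -v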
Download and extract the source code
The project website can be found here: Apache Atlas – Data Governance and Metadata framework for Hadoop
Find and download the Apache Atlas source release.
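For example, fetching and unpacking the 2.2.0 source release from the Apache archive (a sketch — the mirror URL and the extracted directory name are assumptions; use whichever link the download page gives you):
wget https://archive.apache.org/dist/atlas/2.2.0/apache-atlas-2.2.0-sources.tar.gz
tar -xzvf apache-atlas-2.2.0-sources.tar.gz
cd apache-atlas-sources-2.2.0    # the directory name may differ depending on the release archive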
Modify pom.xml
In the main pom (the top-level pom.xml of the project), add a Cloudera repository that contains the required Maven artifacts:
<repository>
    <id>cloudera</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    <releases>
        <enabled>true</enabled>
    </releases>
    <snapshots>
        <enabled>false</enabled>
    </snapshots>
</repository>
Then change the corresponding CDH component versions:
<hadoop.version>3.0.0-cdh6.3.2</hadoop.version>
<hbase.version>2.1.0-cdh6.3.2</hbase.version>
<hive.version>2.1.1-cdh6.3.2</hive.version>
<kafka.scala.binary.version>2.11</kafka.scala.binary.version>
<kafka.version>2.2.1-cdh6.3.2</kafka.version>
<solr-test-framework.version>7.4.0-cdh6.3.2</solr-test-framework.version>
<lucene-solr.version>7.4.0</lucene-solr.version>
<solr.version>7.4.0-cdh6.3.2</solr.version>
<sqoop.version>1.4.7-cdh6.3.2</sqoop.version>
<zookeeper.version>3.4.5-cdh6.3.2</zookeeper.version>
Then change the versions of a few jars.
Change the version of the "atlas-buildtools" artifact from "1.0" to "0.8.1":
<dependency>
    <groupId>org.apache.atlas</groupId>
    <artifactId>atlas-buildtools</artifactId>
    <version>0.8.1</version>
</dependency>
Change jsr.version to 2.0.1:
<jsr.version>2.0.1</jsr.version>
Modify some of the sub-poms
In the project root:
grep -rn jsr311-api | grep pom.xml
addons/impala-bridge/pom.xml:332
addons/falcon-bridge/pom.xml:178
addons/hive-bridge/pom.xml:312
addons/hbase-bridge/pom.xml:345
addons/storm-bridge/pom.xml:360
addons/sqoop-bridge/pom.xml:250
In these pom files, change the jsr311-api artifact to javax.ws.rs-api; a batch edit is sketched below.
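One way to apply that substitution in bulk (a sketch — it only rewrites the artifactId, so review each pom afterwards, e.g. with git diff, and adjust the dependency version where needed):
for f in addons/impala-bridge/pom.xml addons/falcon-bridge/pom.xml addons/hive-bridge/pom.xml \
         addons/hbase-bridge/pom.xml addons/storm-bridge/pom.xml addons/sqoop-bridge/pom.xml; do
    # replace the jsr311-api artifactId with javax.ws.rs-api in place
    sed -i 's#<artifactId>jsr311-api</artifactId>#<artifactId>javax.ws.rs-api</artifactId>#g' "$f"
done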
Modify other files
In the file addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java, go to line 618,
comment out "String catalogName = hiveDB.getCatalogName() != null ? hiveDB.getCatalogName().toLowerCase() : null;"
and add "String catalogName = null;":
public static String getDatabaseName(Database hiveDB) {
    String dbName      = hiveDB.getName().toLowerCase();
    //String catalogName = hiveDB.getCatalogName() != null ? hiveDB.getCatalogName().toLowerCase() : null;
    String catalogName = null;

    if (StringUtils.isNotEmpty(catalogName) && !StringUtils.equals(catalogName, DEFAULT_METASTORE_CATALOG)) {
        dbName = catalogName + SEP + dbName;
    }

    return dbName;
}
In the file addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/AtlasHiveHookContext.java, go to line 83:
"this.metastoreHandler = (listenerEvent != null) ? metastoreEvent.getIHMSHandler() : null;"
Comment it out and add "this.metastoreHandler = null;":
public AtlasHiveHookContext(HiveHook hook, HiveOperation hiveOperation, HookContext hiveContext, HiveHookObjectNamesCache knownObjects,
                            HiveMetastoreHook metastoreHook, ListenerEvent listenerEvent) throws Exception {
    this.hook           = hook;
    this.hiveOperation  = hiveOperation;
    this.hiveContext    = hiveContext;
    this.hive           = hiveContext != null ? Hive.get(hiveContext.getConf()) : null;
    this.knownObjects   = knownObjects;
    this.metastoreHook  = metastoreHook;
    this.metastoreEvent = listenerEvent;
    //this.metastoreHandler = (listenerEvent != null) ? metastoreEvent.getIHMSHandler() : null;
    this.metastoreHandler = null;

    init();
}
In the file addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/events/CreateHiveProcess.java,
comment out line 293, which mentions "MATERIALIZED_VIEW":
private boolean isDdlOperation(AtlasEntity entity) {
    return entity != null && !context.isMetastoreHook()
        && (context.getHiveOperation().equals(HiveOperation.CREATETABLE_AS_SELECT)
            || context.getHiveOperation().equals(HiveOperation.CREATEVIEW)
            || context.getHiveOperation().equals(HiveOperation.ALTERVIEW_AS));
            //|| context.getHiveOperation().equals(HiveOperation.CREATE_MATERIALIZED_VIEW));
}
Note that you must add the closing ");" on the line above yourself, because the original closing characters are now inside the comment.
In the file addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java,
comment out lines 212 and 217, which mention "MATERIALIZED_VIEW".
Start the build. There are basically no pitfalls; if something goes wrong, retry a few times — packages sometimes fail to download because of network problems.
mvn clean -DskipTests package -Pdist -Drat.skip=true
The package ends up at distro/target/apache-atlas-2.2.0-bin.tar.gz.
Do not use the "server" package mentioned in the official documentation — it is missing the various hook files.
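To unpack the build output into the install location used in the rest of this guide (a sketch; /data is just the directory assumed here, and the top-level directory name may vary slightly by build):
mkdir -p /data
tar -xzvf distro/target/apache-atlas-2.2.0-bin.tar.gz -C /data/
ls /data/apache-atlas-2.2.0    # bin/, conf/, hook/ etc. should be present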
After extracting the package to the installation directory, begin the installation.
Prepare the CDH cluster services for the Atlas deployment:
Atlas uses HBase to store its Janus database. Solr is used to store and search the audit logs. Kafka is used as the messenger from the Atlas libraries (the hooks embedded in the Hadoop services) to Atlas itself.
1.1. Create the necessary tables in HBase
On the Atlas machine, or on any other machine with the "HBase Gateway" role installed, create the necessary tables:
TABLE1=apache_atlas_entity_audit
TABLE2=apache_atlas_janus
echo "create '${TABLE1}', 'dt'" | hbase shell
echo "create '${TABLE2}', 's'" | hbase shell
Check the created tables. On the Atlas machine, or on any other machine with the "HBase Gateway" role installed, run:
echo "list" | hbase shell
Standard output:
Took 0.0028 seconds
list
TABLE
apache_atlas_entity_audit
apache_atlas_janus
2 row(s)
Took 0.6872 seconds
[apache_atlas_entity_audit, apache_atlas_janus]
Add the HBase cluster configuration files under conf/hbase:
ln -s /etc/hbase/conf/ /data/apache-atlas-2.2.0/conf/hbase
Apache Kafka
Atlas uses Apache Kafka to receive messages about events that happen in the Hadoop services. The messages are sent by special Atlas libraries (hooks) embedded in some of the services. Currently, Atlas reads messages about events in HBase and Hive — creating and dropping tables, adding columns, and so on.
Add the necessary topics in Kafka. Apache Atlas needs three topics in Apache Kafka. Create them on a machine where Kafka is installed:
kafka-topics --zookeeper S0:2181,S1:2181,S2:2181,S3:2181 --create --replication-factor 3 --partitions 3 --topic _HOATLASOK
kafka-topics --zookeeper S0:2181,S1:2181,S2:2181,S3:2181 --create --replication-factor 3 --partitions 3 --topic ATLAS_ENTITIES
kafka-topics --zookeeper S0:2181,S1:2181,S2:2181,S3:2181 --create --replication-factor 3 --partitions 3 --topic ATLAS_HOOK
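To confirm the three topics exist (a sketch, using the same ZooKeeper quorum as the create commands above):
kafka-topics --zookeeper S0:2181,S1:2181,S2:2181,S3:2181 --list | grep -E '_HOATLASOK|ATLAS_ENTITIES|ATLAS_HOOK'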
With Kerberos enabled this is a bit more troublesome; for details see this post:
Kerberos环境下 命令行连接kafka 和zk_启用kerberos后zk_Mumunu-的博客-CSDN博客
Configure a Sentry role for Atlas so that it can access the Kafka topics.
On a machine with the "Kafka Gateway" and "Sentry Gateway" roles, create the "kafka4atlas_role" role in Sentry:
KROLE=kafka4atlas_role
kafka-sentry -cr -r ${KROLE}
Assign the created role to the atlas group:
kafka-sentry -arg -r ${KROLE} -g atlas
Assign the consumer permissions:
TOPIC1=_HOATLASOK
TOPIC2=ATLAS_ENTITIES
TOPIC3=ATLAS_HOOK
kafka-sentry -gpr -r ${KROLE} -p "Host=*->CONSUMERGROUP=*->action=read"
kafka-sentry -gpr -r ${KROLE} -p "Host=*->CONSUMERGROUP=*->action=describe"
kafka-sentry -gpr -r ${KROLE} -p "HOST=*->TOPIC=${TOPIC1}->action=read"
kafka-sentry -gpr -r ${KROLE} -p "HOST=*->TOPIC=${TOPIC2}->action=read"
kafka-sentry -gpr -r ${KROLE} -p "HOST=*->TOPIC=${TOPIC3}->action=read"
kafka-sentry -gpr -r ${KROLE} -p "HOST=*->TOPIC=${TOPIC1}->action=describe"
kafka-sentry -gpr -r ${KROLE} -p "HOST=*->TOPIC=${TOPIC2}->action=describe"
kafka-sentry -gpr -r ${KROLE} -p "HOST=*->TOPIC=${TOPIC3}->action=describe"
Assign the producer permissions:
kafka-sentry -gpr -r ${KROLE} -p "HOST=*->TOPIC=${TOPIC1}->action=write"
kafka-sentry -gpr -r ${KROLE} -p "HOST=*->TOPIC=${TOPIC2}->action=write"
kafka-sentry -gpr -r ${KROLE} -p "HOST=*->TOPIC=${TOPIC3}->action=write"
Check the Sentry settings:
$ kafka-sentry -lr
....
solradm_role
kafka4atlas_role
Show the list of groups with their assigned roles:
$ kafka-sentry -lg
...
atlas = kafka4atlas_role
test2_solr_admins = solradm_role
Show the list of privileges:
$ kafka-sentry -lp -r kafka4atlas_role
...
HOST=*->TOPIC=_HOATLASOK->action=read
HOST=*->TOPIC=_HOATLASOK->action=describe
HOST=*->TOPIC=ATLAS_HOOK->action=read
HOST=*->TOPIC=ATLAS_ENTITIES->action=describe
HOST=*->TOPIC=ATLAS_HOOK->action=describe
HOST=*->CONSUMERGROUP=*->action=describe
HOST=*->TOPIC=_HOATLASOK->action=write
HOST=*->TOPIC=ATLAS_ENTITIES->action=write
HOST=*->TOPIC=ATLAS_HOOK->action=write
HOST=*->TOPIC=ATLAS_ENTITIES->action=read
HOST=*->CONSUMERGROUP=*->action=read
Integrate with CDH's Solr
① Copy the apache-atlas-2.2.0/conf/solr directory into the Solr installation directory (/opt/cloudera/parcels/CDH/lib/solr) and rename it to atlas-solr.
② Create the collections:
vi /etc/passwd
Change the solr user's shell from /sbin/nologin to /bin/bash, then:
su - solr
/opt/cloudera/parcels/CDH/lib/solr/bin/solr create -c vertex_index -d /opt/cloudera/parcels/CDH/lib/solr/atlas-solr -shards 3 -replicationFactor 2
/opt/cloudera/parcels/CDH/lib/solr/bin/solr create -c edge_index -d /opt/cloudera/parcels/CDH/lib/solr/atlas-solr -shards 3 -replicationFactor 2
/opt/cloudera/parcels/CDH/lib/solr/bin/solr create -c fulltext_index -d /opt/cloudera/parcels/CDH/lib/solr/atlas-solr -shards 3 -replicationFactor 2
③ Verify that the collections were created successfully: log in to the Solr web console at http://xxxx:8983 and check that they are there.
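You can also check the collections from the command line instead of the web console (a sketch; replace xxxx with a Solr host, and add authentication options if your Solr is secured):
curl "http://xxxx:8983/solr/admin/collections?action=LIST"
# the response should list vertex_index, edge_index and fulltext_index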
Create the relevant Kerberos principals and keytabs ahead of time.
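For example, creating the atlas service principal and keytab referenced later in atlas-application.properties (a sketch, assuming an MIT KDC with kadmin.local access and the TEST.COM realm / s1.hadoop.com host used throughout this guide):
kadmin.local -q "addprinc -randkey atlas/s1.hadoop.com@TEST.COM"
kadmin.local -q "xst -k /data/atlas.service.keytab atlas/s1.hadoop.com@TEST.COM"
chmod 400 /data/atlas.service.keytab    # and chown it to whatever account runs Atlas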
Modify atlas-application.properties
#########  Graph Database Configs  #########

# Graph Database
#Configures the graph database to use.  Defaults to JanusGraph
#atlas.graphdb.backend=org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase

# Graph Storage
# Set atlas.graph.storage.backend to the correct value for your desired storage
# backend. Possible values:
#
# hbase
# cassandra
# embeddedcassandra - Should only be set by building Atlas with -Pdist,embedded-cassandra-solr
# berkeleyje
#
# See the configuration documentation for more information about configuring the various storage backends.
#
atlas.graph.storage.backend=hbase
atlas.graph.storage.hbase.table=apache_atlas_janus

#Hbase
#For standalone mode , specify localhost
#for distributed mode, specify zookeeper quorum here
atlas.graph.storage.hostname=S0:2181,S1:2181,S2:2181
atlas.graph.storage.hbase.regions-per-server=1
atlas.graph.storage.lock.wait-time=10000

#In order to use Cassandra as a backend, comment out the hbase specific properties above, and uncomment the
#the following properties
#atlas.graph.storage.clustername=
#atlas.graph.storage.port=

# Gremlin Query Optimizer
#
# Enables rewriting gremlin queries to maximize performance. This flag is provided as
# a possible way to work around any defects that are found in the optimizer until they
# are resolved.
#atlas.query.gremlinOptimizerEnabled=true

# Delete handler
#
# This allows the default behavior of doing soft deletes to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.store.graph.v1.SoftDeleteHandlerV1 - all deletes are soft deletes
# org.apache.atlas.repository.store.graph.v1.HardDeleteHandlerV1 - all deletes are hard deletes
#
#atlas.DeleteHandlerV1.impl=org.apache.atlas.repository.store.graph.v1.SoftDeleteHandlerV1

# Entity audit repository
#
# This allows the default behavior of logging entity changes to hbase to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.audit.HBaseBasedAuditRepository - log entity changes to hbase
# org.apache.atlas.repository.audit.CassandraBasedAuditRepository - log entity changes to cassandra
# org.apache.atlas.repository.audit.NoopEntityAuditRepository - disable the audit repository
#
atlas.EntityAuditRepository.impl=org.apache.atlas.repository.audit.HBaseBasedAuditRepository

# if Cassandra is used as a backend for audit from the above property, uncomment and set the following
# properties appropriately. If using the embedded cassandra profile, these properties can remain
# commented out.
# atlas.EntityAuditRepository.keyspace=atlas_audit
# atlas.EntityAuditRepository.replicationFactor=1

# Graph Search Index
atlas.graph.index.search.backend=solr

#Solr
#Solr cloud mode properties
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=S0:2181/solr,S1:2181/solr,S2:2181/solr
atlas.graph.index.search.solr.zookeeper-connect-timeout=60000
atlas.graph.index.search.solr.zookeeper-session-timeout=60000
atlas.graph.index.search.solr.wait-searcher=true

#Solr http mode properties
#atlas.graph.index.search.solr.mode=http
#atlas.graph.index.search.solr.http-urls=http://localhost:8983/solr

# ElasticSearch support (Tech Preview)
# Comment out above solr configuration, and uncomment the following two lines. Additionally, make sure the
# hostname field is set to a comma delimited set of elasticsearch master nodes, or an ELB that fronts the masters.
#
# Elasticsearch does not provide authentication out of the box, but does provide an option with the X-Pack product
# https://www.elastic.co/products/x-pack/security
#
# Alternatively, the JanusGraph documentation provides some tips on how to secure Elasticsearch without additional
# plugins: https://docs.janusgraph.org/latest/elasticsearch.html
#atlas.graph.index.search.hostname=localhost
#atlas.graph.index.search.elasticsearch.client-only=true

# Solr-specific configuration property
atlas.graph.index.search.max-result-set-size=150

#########  Import Configs  #########
#atlas.import.temp.directory=/temp/import

#########  Notification Configs  #########
atlas.notification.embedded=false
atlas.kafka.data=${sys:atlas.home}/data/kafka
atlas.kafka.zookeeper.connect=S0:2181,S1:2181,S2:2181
atlas.kafka.bootstrap.servers=S0:9092,S1:9092,S2:9092
atlas.kafka.zookeeper.session.timeout.ms=60000
atlas.kafka.zookeeper.connection.timeout.ms=60000
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.hook.group.id=atlas
atlas.kafka.enable.auto.commit=false
atlas.kafka.auto.offset.reset=earliest
atlas.kafka.session.timeout.ms=30000
atlas.kafka.offsets.topic.replication.factor=1
atlas.kafka.poll.timeout.ms=1000

atlas.notification.create.topics=true
atlas.notification.replicas=1
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
atlas.notification.log.failed.messages=true
atlas.notification.consumer.retry.interval=500
atlas.notification.hook.retry.interval=1000
# Enable for Kerberized Kafka clusters
#atlas.notification.kafka.service.principal=kafka/_HOST@EXAMPLE.COM
#atlas.notification.kafka.keytab.location=/etc/security/keytabs/kafka.service.keytab

## Server port configuration
atlas.server.http.port=21000
#atlas.server.https.port=21443

#########  Security Properties  #########

# SSL config
atlas.enableTLS=false

#truststore.file=/path/to/truststore.jks
#cert.stores.credential.provider.path=jceks://file/path/to/credentialstore.jceks

#following only required for 2-way SSL
#keystore.file=/path/to/keystore.jks

# Authentication config
atlas.authentication.method=kerberos
atlas.authentication.keytab=/data/hive.keytab
atlas.authentication.principal=hive@TEST.COM

atlas.authentication.method.kerberos=true
atlas.authentication.method.kerberos.principal=hive@TEST.COM
atlas.authentication.method.kerberos.keytab=/data/hive.keytab
atlas.authentication.method.kerberos.name.rules=RULE:[2:$1@$0](hive@TEST.COM)s/.*/hive/
atlas.authentication.method.kerberos.token.validity=3600

#atlas.authentication.method.file=true

#### ldap.type= LDAP or AD
atlas.authentication.method.ldap.type=none

#### user credentials file
atlas.authentication.method.file.filename=${sys:atlas.home}/conf/users-credentials.properties

### groups from UGI
#atlas.authentication.method.ldap.ugi-groups=true

######## LDAP properties #########
#atlas.authentication.method.ldap.url=ldap://<ldap server url>:389
#atlas.authentication.method.ldap.userDNpattern=uid={0},ou=People,dc=example,dc=com
#atlas.authentication.method.ldap.groupSearchBase=dc=example,dc=com
#atlas.authentication.method.ldap.groupSearchFilter=(member=uid={0},ou=Users,dc=example,dc=com)
#atlas.authentication.method.ldap.groupRoleAttribute=cn
#atlas.authentication.method.ldap.base.dn=dc=example,dc=com
#atlas.authentication.method.ldap.bind.dn=cn=Manager,dc=example,dc=com
#atlas.authentication.method.ldap.bind.password=<password>
#atlas.authentication.method.ldap.referral=ignore
#atlas.authentication.method.ldap.user.searchfilter=(uid={0})
#atlas.authentication.method.ldap.default.role=<default role>

######### Active directory properties #######
#atlas.authentication.method.ldap.ad.domain=example.com
#atlas.authentication.method.ldap.ad.url=ldap://<AD server url>:389
#atlas.authentication.method.ldap.ad.base.dn=(sAMAccountName={0})
#atlas.authentication.method.ldap.ad.bind.dn=CN=team,CN=Users,DC=example,DC=com
#atlas.authentication.method.ldap.ad.bind.password=<password>
#atlas.authentication.method.ldap.ad.referral=ignore
#atlas.authentication.method.ldap.ad.user.searchfilter=(sAMAccountName={0})
#atlas.authentication.method.ldap.ad.default.role=<default role>

#########  JAAS Configuration ########
atlas.jaas.KafkaClient.loginModuleName=com.sun.security.auth.module.Krb5LoginModule
atlas.jaas.KafkaClient.loginModuleControlFlag=required
atlas.jaas.KafkaClient.option.useKeyTab=true
atlas.jaas.KafkaClient.option.storeKey=true
atlas.jaas.KafkaClient.option.serviceName=kafka
atlas.jaas.KafkaClient.option.keyTab=/data/atlas.service.keytab
atlas.jaas.KafkaClient.option.principal=atlas/s1.hadoop.com@TEST.COM

atlas.jaas.Client.loginModuleName=com.sun.security.auth.module.Krb5LoginModule
atlas.jaas.Client.loginModuleControlFlag=required
atlas.jaas.Client.option.useKeyTab=true
atlas.jaas.Client.option.storeKey=true
atlas.jaas.Client.option.keyTab=/data/atlas.service.keytab
atlas.jaas.Client.option.principal=atlas/s1.hadoop.com@TEST.COM

#########  Server Properties  #########
atlas.rest.address=http://localhost:21000
# If enabled and set to true, this will run setup steps when the server starts
#atlas.server.run.setup.on.start=false

#########  Entity Audit Configs  #########
atlas.audit.hbase.tablename=apache_atlas_entity_audit
atlas.audit.zookeeper.session.timeout.ms=1000
atlas.audit.hbase.zookeeper.quorum=S0:2181,S1:2181,S2:2181

#########  High Availability Configuration ########
atlas.server.ha.enabled=false
#### Enabled the configs below as per need if HA is enabled #####
#atlas.server.ids=id1
#atlas.server.address.id1=localhost:21000
#atlas.server.ha.zookeeper.connect=localhost:2181
#atlas.server.ha.zookeeper.retry.sleeptime.ms=1000
#atlas.server.ha.zookeeper.num.retries=3
#atlas.server.ha.zookeeper.session.timeout.ms=20000
## if ACLs need to be set on the created nodes, uncomment these lines and set the values ##
#atlas.server.ha.zookeeper.acl=<scheme>:<id>
#atlas.server.ha.zookeeper.auth=<scheme>:<authinfo>

######### Atlas Authorization #########
atlas.authorizer.impl=simple
atlas.authorizer.simple.authz.policy.file=atlas-simple-authz-policy.json

#########  Type Cache Implementation ########
# A type cache class which implements
# org.apache.atlas.typesystem.types.cache.TypeCache.
# The default implementation is org.apache.atlas.typesystem.types.cache.DefaultTypeCache which is a local in-memory type cache.
#atlas.TypeCache.impl=

#########  Performance Configs  #########
#atlas.graph.storage.lock.retries=10
#atlas.graph.storage.cache.db-cache-time=120000

#########  CSRF Configs  #########
atlas.rest-csrf.enabled=true
atlas.rest-csrf.browser-useragents-regex=^Mozilla.*,^Opera.*,^Chrome.*
atlas.rest-csrf.methods-to-ignore=GET,OPTIONS,HEAD,TRACE
atlas.rest-csrf.custom-header=X-XSRF-HEADER

############ KNOX Configs ################
#atlas.sso.knox.browser.useragent=Mozilla,Chrome,Opera
#atlas.sso.knox.enabled=true
#atlas.sso.knox.providerurl=https://<knox gateway ip>:8443/gateway/knoxsso/api/v1/websso
#atlas.sso.knox.publicKey=

############ Atlas Metric/Stats configs ################
# Format: atlas.metric.query.<key>.<name>
atlas.metric.query.cache.ttlInSecs=900
#atlas.metric.query.general.typeCount=
#atlas.metric.query.general.typeUnusedCount=
#atlas.metric.query.general.entityCount=
#atlas.metric.query.general.tagCount=
#atlas.metric.query.general.entityDeleted=
#
#atlas.metric.query.entity.typeEntities=
#atlas.metric.query.entity.entityTagged=
#
#atlas.metric.query.tags.entityTags=

#########  Compiled Query Cache Configuration  #########

# The size of the compiled query cache.  Older queries will be evicted from the cache
# when we reach the capacity.

#atlas.CompiledQueryCache.capacity=1000

# Allows notifications when items are evicted from the compiled query
# cache because it has become full.  A warning will be issued when
# the specified number of evictions have occurred.  If the eviction
# warning threshold <= 0, no eviction warnings will be issued.

#atlas.CompiledQueryCache.evictionWarningThrottle=0

#########  Full Text Search Configuration  #########

#Set to false to disable full text search.
#atlas.search.fulltext.enable=true

#########  Gremlin Search Configuration  #########

#Set to false to disable gremlin search.
atlas.search.gremlin.enable=false

########## Add http headers ###########
#atlas.headers.Access-Control-Allow-Origin=*
#atlas.headers.Access-Control-Allow-Methods=GET,OPTIONS,HEAD,PUT,POST
#atlas.headers.<headerName>=<headerValue>

#########  UI Configuration ########
atlas.ui.default.version=v1
There are a lot of settings to change — check them carefully, because many of the defaults do not work as-is. The keytab can be newly created or reused; I was worried about permission issues, so I chose the hive account. The corresponding permissions should probably also be granted in HBase, but I have not tested whether that is actually required; a sketch is given below.
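If you do need to grant them, a minimal HBase grant would look like this (a sketch — grant to whichever principal Atlas actually uses to reach HBase, e.g. atlas from the JAAS Client section, or hive if you reuse that account throughout):
echo "grant 'atlas', 'RWXCA', 'apache_atlas_janus'" | hbase shell
echo "grant 'atlas', 'RWXCA', 'apache_atlas_entity_audit'" | hbase shell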
Modify atlas-env.sh
#!/usr/bin/env bash

# The java implementation to use. If JAVA_HOME is not found we expect java and jar to be in path
export JAVA_HOME=/usr/java/default
export HBASE_CONF_DIR=/etc/hbase/conf

# any additional java opts you want to set. This will apply to both client and server operations
#export ATLAS_OPTS=

# any additional java opts that you want to set for client only
#export ATLAS_CLIENT_OPTS=

# java heap size we want to set for the client. Default is 1024MB
#export ATLAS_CLIENT_HEAP=

# any additional opts you want to set for atlas service.
#export ATLAS_SERVER_OPTS=

# indicative values for large number of metadata entities (equal or more than 10,000s)
export ATLAS_SERVER_OPTS="-server -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+PrintTenuringDistribution -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dumps/atlas_server.hprof -Xloggc:logs/gc-worker.log -verbose:gc -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1m -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCTimeStamps -Djava.security.krb5.conf=/etc/krb5.conf -Djava.security.auth.login.config=/data/atlas2.2/conf/jaas.conf"

# java heap size we want to set for the atlas server. Default is 1024MB
#export ATLAS_SERVER_HEAP=

# indicative values for large number of metadata entities (equal or more than 10,000s) for JDK 8
export ATLAS_SERVER_HEAP="-Xms15360m -Xmx15360m -XX:MaxNewSize=5120m -XX:MetaspaceSize=100M -XX:MaxMetaspaceSize=512m"

# What is considered as atlas home dir. Default is the base location of the installed software
export ATLAS_HOME_DIR=/opt/atlas2.2

# Where log files are stored. Default is logs directory under the base install location
#export ATLAS_LOG_DIR=

# Where pid files are stored. Default is logs directory under the base install location
#export ATLAS_PID_DIR=

# where the atlas titan db data is stored. Default is logs/data directory under the base install location
#export ATLAS_DATA_DIR=

# Where do you want to expand the war file. By default it is in /server/webapp dir under the base install dir.
#export ATLAS_EXPANDED_WEBAPP_DIR=

# indicates whether or not a local instance of HBase should be started for Atlas
export MANAGE_LOCAL_HBASE=false

# indicates whether or not a local instance of Solr should be started for Atlas
export MANAGE_LOCAL_SOLR=false

# indicates whether or not cassandra is the embedded backend for Atlas
export MANAGE_EMBEDDED_CASSANDRA=false

# indicates whether or not a local instance of Elasticsearch should be started for Atlas
export MANAGE_LOCAL_ELASTICSEARCH=false
A jaas.conf also needs to be added — this is the file referenced by -Djava.security.auth.login.config in atlas-env.sh:
Client {
   com.sun.security.auth.module.Krb5LoginModule required
   useKeyTab=true
   keyTab="/data/atlas.service.keytab"
   storeKey=true
   principal="atlas/s1.hadoop.com@TEST.COM"
   debug=false;
};
Integrate with Hive
First add three configuration items to Hive in CDH:
Java Configuration Options for HiveServer2:
{{JAVA_GC_ARGS}} -Datlas.conf=/data/apache-atlas-2.2.0/conf/
HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml:
Name: hive.exec.post.hooks    Value: org.apache.atlas.hive.hook.HiveHook
HiveServer2 Environment Advanced Configuration Snippet (Safety Valve):
HIVE_AUX_JARS_PATH=/data/apache-atlas-2.2.0/hook/hive/
Then copy an atlas-application.properties into /etc/hive/conf. Note that this copy needs the following changes.
Change this to false:
atlas.authentication.method.kerberos=false
Add:
atlas.client.readTimeoutMSecs=90000
atlas.client.connectTimeoutMSecs=90000
The last two settings are the read and connect timeouts — the defaults are too short.
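One way to put the file in place (a sketch; if /etc/hive/conf is managed by Cloudera Manager, distribute the file to every HiveServer2 host instead of copying it only once):
cp /data/apache-atlas-2.2.0/conf/atlas-application.properties /etc/hive/conf/
vi /etc/hive/conf/atlas-application.properties    # apply the changes above to this copy only, not to the server's copy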
Then you can start Atlas:
bin/atlas-start.py
bin/atlas-stop.py
The startup takes quite a while — it includes index creation, data initialization and other work, and can run for several hours, so be patient. Meanwhile, follow the Atlas startup log until it stops scrolling, then use lsof or netstat to check whether port 21000 is being listened on. Once it is, open a browser at ip:21000 to reach the Atlas login page. Do not trust the "Apache Atlas Server started!!!" message or the Atlas process shown by jps: after a fixed amount of time the start script always reports success, even though port 21000 is not yet listening and the service is not usable. The service is truly ready only once port 21000 is listened on and the Atlas login page loads. Then you can start using it.
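A quick way to tell when the server is really up (a sketch; the log path assumes the install directory used above, and admin/admin are the default file-based credentials):
tail -f /data/apache-atlas-2.2.0/logs/application.log        # wait until the log stops scrolling
netstat -nltp | grep 21000                                    # or: lsof -i:21000
curl -s -o /dev/null -w "%{http_code}\n" -u admin:admin http://localhost:21000/api/atlas/admin/version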
Import Hive data
Remember to kinit first.
bin/import-hive.sh
You can also import a single database:
bin/import-hive.sh -d default
During the import you will be prompted for the Atlas username and password; enter admin for both. A success message is printed when it finishes; how long it takes depends on how much data already exists in Hive. After logging in, click the small icon in the upper-right corner to see the overall data statistics and list all Hive tables. Click any table to view its details — the table's properties, columns, lineage graph and so on are all clearly visible. You can also use the search bar on the left to filter for the items you want to find.
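The import can also be verified from the command line through the Atlas REST API (a sketch; admin/admin are the credentials used above and ip is a placeholder for the Atlas host):
# list a few of the imported hive_table entities via the v2 basic search API
curl -s -u admin:admin "http://ip:21000/api/atlas/v2/search/basic?typeName=hive_table&limit=5"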