Running the following SQL in Hive fails with an error:
select count(device_id_md5) from dmp_device_app LATERAL VIEW explode(devices) tb_devices as device_id_md5,flag where source_appid_type < '1'
(1) The full console output is as follows:
hive> select count(device_id_md5) from dmp_device_app LATERAL VIEW explode(devices) tb_devices as device_id_md5,flag where source_appid_type < '1';
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20170831100418_3f321d81-eda9-4432-9ce3-1c28be27dc84
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1502939269753_4159, Tracking URL = http://master1:9999/proxy/application_1502939269753_4159/
Kill Command = /data0/soft/hadoop-2.6.0-cdh5.5.0/bin/hadoop job -kill job_1502939269753_4159
Hadoop job information for Stage-1: number of mappers: 16; number of reducers: 1
2017-08-31 10:04:25,902 Stage-1 map = 0%, reduce = 0%
2017-08-31 10:04:32,205 Stage-1 map = 6%, reduce = 0%, Cumulative CPU 4.3 sec
2017-08-31 10:04:33,245 Stage-1 map = 38%, reduce = 0%, Cumulative CPU 47.0 sec
2017-08-31 10:04:34,286 Stage-1 map = 44%, reduce = 0%, Cumulative CPU 56.87 sec
2017-08-31 10:04:35,325 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 67.47 sec
2017-08-31 10:04:42,605 Stage-1 map = 50%, reduce = 17%, Cumulative CPU 188.38 sec
2017-08-31 10:04:45,717 Stage-1 map = 56%, reduce = 17%, Cumulative CPU 217.13 sec
2017-08-31 10:04:48,820 Stage-1 map = 56%, reduce = 19%, Cumulative CPU 240.6 sec
2017-08-31 10:05:16,771 Stage-1 map = 63%, reduce = 19%, Cumulative CPU 395.2 sec
2017-08-31 10:05:19,878 Stage-1 map = 63%, reduce = 21%, Cumulative CPU 417.4 sec
2017-08-31 10:05:22,979 Stage-1 map = 69%, reduce = 21%, Cumulative CPU 435.62 sec
2017-08-31 10:05:26,091 Stage-1 map = 69%, reduce = 23%, Cumulative CPU 453.44 sec
2017-08-31 10:05:36,453 Stage-1 map = 75%, reduce = 23%, Cumulative CPU 435.83 sec
2017-08-31 10:05:38,517 Stage-1 map = 75%, reduce = 25%, Cumulative CPU 458.08 sec
2017-08-31 10:06:05,420 Stage-1 map = 81%, reduce = 25%, Cumulative CPU 600.16 sec
2017-08-31 10:06:06,453 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 320.84 sec
MapReduce Total cumulative CPU time: 5 minutes 20 seconds 840 msec
Ended Job = job_1502939269753_4159 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1502939269753_4159_m_000010 (and more) from job job_1502939269753_4159
Examining task ID: task_1502939269753_4159_m_000006 (and more) from job job_1502939269753_4159
Task with the most failures(4):
-----
Task ID:
  task_1502939269753_4159_m_000000
URL:
  http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1502939269753_4159&tipid=task_1502939269753_4159_m_000000
-----
Diagnostic Messages for this Task:
Container [pid=16249,containerID=container_1502939269753_4159_01_000025] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 2.9 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1502939269753_4159_01_000025 :
 |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
 |- 16249 16246 16249 16249 (bash) 0 0 108650496 299 /bin/bash -c /data0/soft/java//bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx820m -Djava.io.tmpdir=/data9/hadoop/dfs/nm-local-dir/usercache/hadoop/appcache/application_1502939269753_4159/container_1502939269753_4159_01_000025/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/data0/hadoop/dfs/logs/userlogs/application_1502939269753_4159/container_1502939269753_4159_01_000025 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 30.16.95.134 49860 attempt_1502939269753_4159_m_000000_3 25 1>/data0/hadoop/dfs/logs/userlogs/application_1502939269753_4159/container_1502939269753_4159_01_000025/stdout 2>/data0/hadoop/dfs/logs/userlogs/application_1502939269753_4159/container_1502939269753_4159_01_000025/stderr
 |- 16281 16249 16249 16249 (java) 3680 131 3057590272 262357 /data0/soft/java//bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx820m -Djava.io.tmpdir=/data9/hadoop/dfs/nm-local-dir/usercache/hadoop/appcache/application_1502939269753_4159/container_1502939269753_4159_01_000025/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/data0/hadoop/dfs/logs/userlogs/application_1502939269753_4159/container_1502939269753_4159_01_000025 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 30.16.95.134 49860 attempt_1502939269753_4159_m_000000_3 25
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 16  Reduce: 1   Cumulative CPU: 320.84 sec   HDFS Read: 84606 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 5 minutes 20 seconds 840 msec
(2) The corresponding Hadoop (NodeManager) log:
2017-08-31 10:04:55,621 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Process tree for container: container_1502939269753_4159_01_000009 has processes older than 1 iteration running over the configured limit. Limit=1073741824, current usage = 1074794496
2017-08-31 10:04:55,621 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Container [pid=36129,containerID=container_1502939269753_4159_01_000009] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 3.0 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1502939269753_4159_01_000009 :
 |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
 |- 36129 36117 36129 36129 (bash) 0 0 108650496 299 /bin/bash -c /data0/soft/java//bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx820m -Djava.io.tmpdir=/data3/hadoop/dfs/nm-local-dir/usercache/hadoop/appcache/application_1502939269753_4159/container_1502939269753_4159_01_000009/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/data0/hadoop/dfs/logs/userlogs/application_1502939269753_4159/container_1502939269753_4159_01_000009 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 30.16.95.134 49860 attempt_1502939269753_4159_m_000000_0 9 1>/data0/hadoop/dfs/logs/userlogs/application_1502939269753_4159/container_1502939269753_4159_01_000009/stdout 2>/data0/hadoop/dfs/logs/userlogs/application_1502939269753_4159/container_1502939269753_4159_01_000009/stderr
 |- 36406 36129 36129 36129 (java) 5713 138 3136937984 262102 /data0/soft/java//bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx820m -Djava.io.tmpdir=/data3/hadoop/dfs/nm-local-dir/usercache/hadoop/appcache/application_1502939269753_4159/container_1502939269753_4159_01_000009/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/data0/hadoop/dfs/logs/userlogs/application_1502939269753_4159/container_1502939269753_4159_01_000009 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 30.16.95.134 49860 attempt_1502939269753_4159_m_000000_0 9
2017-08-31 10:04:55,621 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Removed ProcessTree with root 36129
2017-08-31 10:04:55,621 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1502939269753_4159_01_000009 transitioned from RUNNING to KILLING
2017-08-31 10:04:55,621 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1502939269753_4159_01_000009
The key lines from the two logs:
From the Hive output:
Current usage: 1.0 GB of 1 GB physical memory used; 2.9 GB of 2.1 GB virtual memory used. Killing container. Both the physical memory (1.0 GB of the 1 GB limit) and the virtual memory (2.9 GB of the 2.1 GB limit) are at or over their thresholds, so the container is killed.
From the Hadoop (NodeManager) log:
Process tree for container: container_1502939269753_4159_01_000009 has processes older than 1 iteration running over the configured limit. Limit=1073741824, current usage = 1074794496
Container [pid=36129,containerID=container_1502939269753_4159_01_000009] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 3.0 GB of 2.1 GB virtual memory used. Killing container.
The actual usage of 1074794496 bytes exceeds the configured limit of 1073741824 bytes (1 GB), i.e. the NodeManager's ContainersMonitor detects that the map task's container has gone over its memory limit and kills it.
To sum up, the parameters involved are yarn.nodemanager.vmem-pmem-ratio and mapreduce.map.memory.mb.
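As a quick sanity check on the numbers in the logs (this assumes the job ran with the default values, which the reported limits suggest): a map task's physical memory limit is mapreduce.map.memory.mb, and its virtual memory limit is that value multiplied by yarn.nodemanager.vmem-pmem-ratio.

mapreduce.map.memory.mb          = 1024 MB (default)  ->  physical limit = 1 GB
yarn.nodemanager.vmem-pmem-ratio = 2.1     (default)  ->  virtual limit  = 1024 MB * 2.1 ≈ 2.1 GB

These match the "1 GB physical / 2.1 GB virtual" limits reported above, while the task actually needed about 1.0 GB of physical and 2.9-3.0 GB of virtual memory.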
Add parameters such as mapreduce.map.memory.mb and mapreduce.reduce.memory.mb to mapred-site.xml:
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1638M -XX:+UseG1GC -XX:MaxGCPauseMillis=2000 -XX:InitiatingHeapOccupancyPercent=40 -XX:ConcGCThreads=4</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx3276M -XX:+UseG1GC -XX:MaxGCPauseMillis=2000 -XX:InitiatingHeapOccupancyPercent=40 -XX:ConcGCThreads=4</value>
</property>
<property>
  <name>mapreduce.map.speculative</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.reduce.speculative</name>
  <value>true</value>
</property>
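If updating mapred-site.xml on the cluster is not convenient right away, the same job-level MapReduce settings can also be overridden per session from the Hive CLI before rerunning the query. A minimal sketch (the values simply mirror the container and heap sizes configured above; this works only for the mapreduce.* job parameters, not for the NodeManager-side ratio below):

set mapreduce.map.memory.mb=2048;
set mapreduce.map.java.opts=-Xmx1638M;
set mapreduce.reduce.memory.mb=4096;
set mapreduce.reduce.java.opts=-Xmx3276M;
select count(device_id_md5) from dmp_device_app LATERAL VIEW explode(devices) tb_devices as device_id_md5,flag where source_appid_type < '1';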
Increase yarn.nodemanager.vmem-pmem-ratio in yarn-site.xml (the default is 2.1):
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>10</value>
</property>
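A rough check of the new headroom (assuming the values above are pushed to every node, and that the NodeManagers are restarted so the yarn-site.xml change takes effect): a map task's limits become

physical limit = mapreduce.map.memory.mb = 2048 MB = 2 GB
virtual limit  = 2048 MB * 10 = 20480 MB = 20 GB

both of which are well above the roughly 1.0 GB physical / 3.0 GB virtual usage observed in the failing task.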
That said, the root cause is still that the device_app column family simply holds too much data.