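Matlab and Multivariate Statistical Analysis
Hu Yunfeng, Normal College

Chapter 3

Exercise 3.1  The height, chest circumference, and upper-arm circumference of six 2-year-old boys from a rural area were measured; the sample data are shown in Table 3.1. Assume that the measurements $X_{(a)}$ ($a = 1, \dots, 6$) are a random sample from the normal population $N_3(\mu, \Sigma)$. According to past records, the mean vector of these three measurements for 2-year-old urban boys in the region is $\mu_0 = (90, 58, 16)'$. Test whether rural boys and urban boys in the region have the same mean vector.

Table 3.1  Body measurements of 2-year-old rural boys in the region

Boy   Height (x1), cm   Chest circumference (x2), cm   Upper-arm circumference (x3), cm
1     78                60.6                           16.5
2     76                58.1                           12.5
3     92                63.2                           14.5
4     81                59                             14
5     81                60.8                           15.5
6     84                59.5                           14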
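The test this exercise asks for can be sketched compactly with MATLAB's built-in mean and cov. This is a minimal sketch, assuming the usual one-sample Hotelling statistic $T^2 = n(\bar{X} - \mu_0)' S^{-1} (\bar{X} - \mu_0)$ and the transform $F = \frac{n-p}{p(n-1)} T^2$, which is compared with $F(p, n-p)$ when $\Sigma$ is unknown. The variable names (xbar, mu0, Fcrit05, Fcrit01) are illustrative, and the critical values 9.28 and 29.5 are tabulated values of $F(3,3)$ at the 0.05 and 0.01 levels entered by hand.

```matlab
% Data of Table 3.1: one row per boy, columns x1 (height), x2 (chest), x3 (upper arm).
X = [78 60.6 16.5; 76 58.1 12.5; 92 63.2 14.5; ...
     81 59.0 14.0; 81 60.8 15.5; 84 59.5 14.0];
mu0 = [90; 58; 16];              % hypothesized mean vector (urban boys)

[n, p] = size(X);
xbar = mean(X, 1)';              % sample mean as a column vector
S = cov(X);                      % sample covariance, divisor n-1
d = xbar - mu0;

T2 = n * (d' * (S \ d));         % Hotelling T^2; S \ d avoids forming inv(S)
F = (n - p) / (p * (n - 1)) * T2;

% Tabulated critical values of F(3,3), entered by hand (assumption: read from an F table).
Fcrit05 = 9.28;
Fcrit01 = 29.5;
reject05 = F > Fcrit05;
reject01 = F > Fcrit01;

fprintf('T2 = %.4f, F = %.4f\n', T2, F);
fprintf('reject H0 at 0.05: %d, at 0.01: %d\n', reject05, reject01);
```

Using S \ d rather than inv(S) * d is a small numerical nicety; with a 3-by-3 matrix either form gives the same result to the digits shown.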
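Solution

1. Preliminaries

Test for the mean vector when $\Sigma$ is unknown:

$H_0: \mu = \mu_0$,  $H_1: \mu \neq \mu_0$.

When $H_0$ holds,

$T^2 = n(\bar{X} - \mu_0)' S^{-1} (\bar{X} - \mu_0) \sim T^2(p, n-1)$,  $F = \frac{n-p}{(n-1)p} T^2 \sim F(p, n-p)$.

Reject $H_0$ when $T^2 > T^2_\alpha$ or $F > F_\alpha(p, n-p)$; accept $H_0$ when $T^2 \le T^2_\alpha$ or $F \le F_\alpha(p, n-p)$. Here $\bar{X}$ is the sample mean vector, $S = \frac{1}{n-1} \sum_{a=1}^{n} (X_{(a)} - \bar{X})(X_{(a)} - \bar{X})'$ is the sample covariance matrix, and $T^2_\alpha = \frac{(n-1)p}{n-p} F_\alpha(p, n-p)$.

2. Implementing the example in MATLAB according to the preliminaries

Program for the sample mean and sample covariance:

    X = [78 60.6 16.5; 76 58.1 12.5; 92 63.2 14.5; 81 59.0 14.0; 81 60.8 15.5; 84 59.5 14.0];
    [n, p] = size(X);
    xjunzhi = (1/n) * sum(X, 1);        % sample mean (row vector)
    y = zeros(p, n);                    % centered observations, one per column
    for j = 1:n
        y(:, j) = X(j, :)' - xjunzhi';
    end
    A = zeros(p, p);                    % sum of squares and cross-products
    for k = 1:n
        A = A + y(:, k) * y(:, k)';
    end
    xjunzhi = xjunzhi'
    S = A / (n - 1)

Output:

    xjunzhi =
       82.0000
       60.2000
       14.5000
    S =
       31.6000    8.0400    0.5000
        8.0400    3.1720    1.3100
        0.5000    1.3100    1.9000

Then:

    u = [90; 58; 16];                   % hypothesized mean vector
    t2 = n * (xjunzhi - u)' * (S \ (xjunzhi - u))
    f = (n - p) / (p * (n - 1)) * t2

Output:

    t2 = 420.4447
    f = 84.0889

So $T^2 = 420.4447$ and $F = 84.0889$. From the F table, $F_{0.05}(3,3) = 9.28 < 84.0889$ and $F_{0.01}(3,3) = 29.5 < 84.0889$. Hence $H_0$ is rejected at both $\alpha = 0.05$ and $\alpha = 0.01$: the mean vector of rural boys differs significantly from that of urban boys.

Chapter 4

Exercise 4.1  The table below lists the final examination scores in five main subjects for 12 students chosen from a certain grade. Draw the profile plot and the radar chart for students 1, 2, 11, and 12.

Table 4.1  Student scores

No.   Politics   Chinese   Foreign language   Mathematics   Physics
1     99         94        93                 100           100
2     99         88        96                 99            97
3     100        98        81                 96            100
4     93         88        88                 99            96
5     100        91        72                 96            78
6     90         78        82                 75            97
7     75         73        88                 97            89
8     93         84        83                 68            88
9     87         73        60                 76            84
10    95         82        90                 62            39
11    76         72        43                 67            78
12    85         75        50                 34            37

Solution  Only the following rows are needed:

1     99         94        93                 100           100
2     99         88        96                 99            97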
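11    76         72        43                 67            78
12    85         75        50                 34            37

1. Drawing the profile plot with MATLAB

Program:

    x = 1:5;
    y1 = [99 94 93 100 100];
    y2 = [99 88 96 99 97];
    y3 = [76 72 43 67 78];
    y4 = [85 75 50 34 37];
    plot(x, y1, 'k-o', 'linewidth', 1);
    hold on;
    plot(x, y2, 'r-*', 'linewidth', 2);
    plot(x, y3, 'b-.p', 'linewidth', 2);
    plot(x, y4, 'k-o', 'linewidth', 2);
    xlabel('Subject');
    ylabel('Score');
    legend('1', '2', '11', '12');
    set(gca, 'xtick', [1 2 3 4 5])
    set(gca, 'xticklabel', {'Politics', 'Chinese', 'Foreign language', 'Mathematics', 'Physics'})

Output: (profile plot figure)

2. Drawing the radar chart with MATLAB

This chart is more involved to draw in MATLAB. First we modify the polar function. Typing edit polar in the command window opens the source code of polar. In it, we change

    % plot spokes
    th = (1:6)*2*pi/12;
    cst = cos(th); snt = sin(th);
    cs = [-cst; cst];
    sn = [-snt; snt];
    line(rmax*cs, rmax*sn, 'linestyle', ls, 'color', tc, 'linewidth', 1, ...
         'handlevisibility', 'off', 'parent', cax)

to

    % plot spokes
    th = (1:3)*2*pi/6;
    cst = cos(th); snt = sin(th);
    cs = [-cst; cst];
    sn = [-snt; snt];
    line(rmax*cs, rmax*sn, 'linestyle', ls, 'color', tc, 'linewidth', 1, ...
         'handlevisibility', 'off', 'parent', cax)

Then we change every 30 in the code that follows to 72 and save the file in the work directory under the name mypolar.m. Finally we run the program:

    x = 0:pi/2.5:2*pi;
    y1 = [99 94 93 100 100 99];         % first score repeated to close the polygon
    y2 = [99 88 96 99 97 99];
    y3 = [76 72 43 67 78 76];
    y4 = [85 75 50 34 37 85];
    mypolar(x, y1, 'b');
    hold on;
    mypolar(x, y2, 'm');
    mypolar(x, y3, 'g');
    mypolar(x, y4, 'y')
    legend('1', '2', '11', '12');

Output: (radar chart figure)

Chapter 5  Cluster Analysis

Exercise 5.3  The table below gives the number of employees in China over the years (unit: 10,000 persons). Cluster the data with Fisher's method for ordered samples.

Year   State-owned   Collectively owned
1952   1580          23
1954   1881          121
1956   2423          554
1958   4532          662
1960   5044          925
1962   3303          1012
1964   3465          1136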
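1966   3939          1264
1968   4170          1334
1970   4792          1424
1972   5610          1524
1974   6007          1644
1976   6860          1813
1978   7451          2048
1980   8019          2425

Solution

Step 1: standardize the data, then compute the diameters D.

Program:

    X = [1580 23; 1881 121; 2423 554; 4532 662; 5044 925; 3303 1012; 3465 1136; ...
         3939 1264; 4170 1334; 4792 1424; 5610 1524; 6007 1644; 6860 1813; ...
         7451 2048; 8019 2425];
    stdr = std(X);
    [n, p] = size(X);
    X = X ./ stdr(ones(n, 1), :);       % scale each column by its standard deviation
    D = zeros(n, n);
    for i = 1:n
        for j = 1:n
            if i < j
                t = i:j;
                xgjunzhi = (1/(j-i+1)) * sum(X(t, :), 1);   % mean of samples i..j
                y = zeros(1, j-i+1);
                for s = i:j
                    y(s-i+1) = (X(s, :) - xgjunzhi) * (X(s, :) - xgjunzhi)';
                end
                D(i, j) = sum(y);       % diameter of the segment {i, ..., j}
            else
                D(i, j) = 0;
            end
        end
    end
    D = D'

Output: the matrix is large, so it was tidied up in Excel. Row i, column j holds the diameter of the segment from sample j to sample i.

D =

i\j   1         2         3         4         5         6         7         8         9         10        11        12        13        14        15
1     0
2     0.022567  0
3     0.44898   0.24578   0
4     2.0632    1.3981    0.60024   0
5     3.9256    2.651     1.1802    0.11098   0
6     4.5022    3.0091    1.4238    0.56953   0.40862   0
7     5.179     3.4353    1.6648    0.82576   0.53831   0.02044   0
8     6.0823    4.021     1.976     1.023     0.63343   0.12781   0.047757  0
9     7.0311    4.6502    2.3255    1.2313    0.755     0.26341   0.11275   0.012456  0
10    8.3322    5.5762    2.9094    1.6045    1.0531    0.60619   0.33881   0.13122   0.060032  0
11    10.312    7.1034    4.0117    2.4126    1.7772    1.3793    0.92314   0.52664   0.31541   0.099401  0
12    12.696    8.9972    5.4422    3.5114    2.7548    2.3553    1.669     1.0457    0.65496   0.25632   0.03671   0
13    16.291    11.998    7.8688    5.5038    4.5686    4.1193    3.1032    2.1468    1.4707    0.77122   0.30858   0.12762   0
14    21.117    16.128    11.321    8.4298    7.2316    6.6487    5.2116    3.8312    2.7793    1.6877    0.8881    0.46016   0.10709   0
15    28        22.167    16.528    12.978    11.386    10.546    8.5596    6.627     5.0716    3.4539    2.1748    1.3443    0.59832   0.19951   0

Only the lower-triangular elements are meaningful; all other elements are to be read as empty.

Step 2: compute the loss function matrix L.

Program:

    % split the first m samples into two classes and record where the last class starts
    D = D';
    L = zeros(n-1, n-1);
    alp = zeros(n-1, n-1);
    for m = 2:n
        s = zeros(1, m-1);
        for j = 2:m
            s(1, j-1) = D(1, j-1) + D(j, m);
        end
        L(m-1, 1) = min(s(1, 1:m-1));
        for j = 1:m-1
            if L(m-1, 1) == s(1, j)
                alp(m-1, 1) = j + 1;
            end
        end
    end
    % split into k classes
    for k = 3:n
        for m = k:n
            s = zeros(1, m-k+1);
            for j = k:m
                s(1, j-k+1) = L(j-2, k-2) + D(j, m);
            end
            L(m-1, k-1) = min(s(1, 1:m-k+1));
            for j = 1:m-k+1
                if L(m-1, k-1) == s(1, j)
                    alp(m-1, k-1) = j + k - 1;
                end
            end
        end
    end

Output: the tables are large, so they were tidied up in Excel.

L (rows m = 2 to 14) =

m\k   2         3         4         5         6         7         8         9         10        11        12        13        14
2     0
3     0.022567  0
4     0.44898   0.022567  0
5     0.55996   0.13355   0.022567  0
6     1.0185    0.55996   0.13355   0.022567  0
7     1.2747    0.5804    0.15399   0.043007  0.02044   0
8     1.472     0.68777   0.26136   0.15038   0.043007  0.02044   0
9     1.6803    0.82337   0.39696   0.16644   0.055464  0.032897  0.012456  0
10    2.0535    1.1662    0.71162   0.28521   0.16644   0.055464  0.032897  0.012456  0
11    2.8616    1.7797    0.92277   0.49636   0.26584   0.15486   0.055464  0.032897  0.012456  0
12    3.9604    1.9366    1.0797    0.65328   0.32192   0.20315   0.092174  0.055464  0.032897  0.01246   0
13    5.9528    2.3621    1.4747    1.0202    0.59379   0.32192   0.20315   0.092174  0.055464  0.0329    0.012456  0
14    8.7188    2.9416    2.0437    1.1868    0.76037   0.42901   0.31024   0.19927   0.092174  0.05546   0.032897  0.012456  0

alp =

m\k   2     3     4     5     6     7     8     9     10    11    12    13    14    15
2     2
3     3     3
4     4     4     4
5     4     4     5     5
6     4     6     6     6     6
7     4     6     6     6     6     7
8     4     6     6     6     8     8     8
9     4     6     6     8     8     8     8     9
10    4     6     8     8     10    10    10    10    10
11    4     10    10    10    10    10    11    11    11    11
12    4     10    10    10    11    11    11    12    12    12    12
13    4     11    11    11    11    13    13    13    13    13    13    13
14    10    11    13    13    13    13    13    13    14    14    14    14    14
15    10    12    13    14    15    15    15    15    15    15    15    15    15    15

To explain these two matrices: the rows index the number of samples m (m from 2 to 15) and the columns index the number of classes k (k from 2 to 15), matching the indexing L(m-1, k-1) and alp(m-1, k-1) in the program. An entry of alp is the index of the first sample of the last class in the optimal split of the first m samples into k classes. Only the lower-triangular elements are meaningful; all other elements are to be read as empty.

We now read off the result. To split all 15 samples into three classes, the entry of alp for m = 15, k = 3 is 12, so the third class starts at sample 12 (1974). The remaining 11 samples are then split into two classes: the entry of alp for m = 11, k = 2 is 4, so the second class starts at sample 4 (1958). The loss of this partition is D(1,3) + D(4,11) + D(12,15) = 0.44898 + 2.4126 + 1.3443 = 4.2059. This gives

Class 1: 1952, 1954, 1956
Class 2: 1958, 1960, 1962, 1964, 1966, 1968, 1970, 1972
Class 3: 1974, 1976, 1978, 1980

Chapter 6  Discriminant Analysis

Example 6.6  Carry out a Bayes discriminant analysis on the institutional variables that affected differences in regional economic growth in 1994 across 30 provinces, municipalities, and autonomous regions of China: x1 economic growth rate, x2 degree of non-state ownership, x3 degree of openness, x4 degree of marketization.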
Class              No.   x1     x2      x3      x4
Group 1            1     11.2   57.25   13.47   73.41
                   2     14.9   67.19   7.89    73.09
                   3*    14.3   64.74   19.41   72.33
                   4     13.5   55.63   20.59   77.33
                   5     16.2   75.51   11.06   72.08
                   6     14.3   57.63   22.51   77.35
                   7     20     83.4    15.99   89.5
                   8     21.8   68.03   39.42   71.9
                   9     19     78.31   83.03   80.75
                   10*   16     57.11   12.57   60.91
                   11    11.9   49.97   30.7    69.2
Group 2            12    8.7    30.72   15.41   60.25
                   13    14.3   37.65   12.95   66.42
                   14    10.1   34.63   7.68    62.96
                   15    9.1    56.33   10.3    66.01
                   16    13.8   65.23   4.69    64.24
                   17    15.3   55.62   6.06    54.74
                   18    11     55.55   8.02    67.47
                   19    18     62.85   6.4     58.83
                   20    10.4   30.01   4.61    60.26
                   21    8.2    29.28   6.11    50.71
                   22    11.4   62.88   5.31    61.49
                   23    11.6   28.57   9.08    68.47
                   24    8.4    30.23   6.03    55.55
                   25    8.2    15.96   8.04    40.26
                   26*   10.9   24.75   8.34    46.01
                   27    15.6   21.44   28.62   46.01
To be classified   28    16.5   80.05   8.81    73.04
                   29    20.6   81.24   5.37    60.43
                   30    8.6    42.06   8.88    56.37
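
Example 6.6 asks for a Bayes discriminant analysis of the table above. The following is a minimal sketch of one common formulation, not necessarily the one used in the textbook. It assumes that both groups are multivariate normal with a common covariance matrix (estimated by the pooled sample covariance), that the prior probabilities are proportional to the group sizes (11/27 and 16/27), and that misclassification costs are equal. Under these assumptions a sample x is assigned to the group g with the larger linear score $\ln q_g - \frac{1}{2}\mu_g' S^{-1} \mu_g + x' S^{-1} \mu_g$. All variable names are illustrative, and x1 of sample 24 is entered as 8.4.

```matlab
% Training samples (columns x1..x4). G1 = group 1 (samples 1-11), G2 = group 2 (samples 12-27).
G1 = [11.2 57.25 13.47 73.41; 14.9 67.19  7.89 73.09; 14.3 64.74 19.41 72.33; ...
      13.5 55.63 20.59 77.33; 16.2 75.51 11.06 72.08; 14.3 57.63 22.51 77.35; ...
      20.0 83.40 15.99 89.50; 21.8 68.03 39.42 71.90; 19.0 78.31 83.03 80.75; ...
      16.0 57.11 12.57 60.91; 11.9 49.97 30.70 69.20];
G2 = [ 8.7 30.72 15.41 60.25; 14.3 37.65 12.95 66.42; 10.1 34.63  7.68 62.96; ...
       9.1 56.33 10.30 66.01; 13.8 65.23  4.69 64.24; 15.3 55.62  6.06 54.74; ...
      11.0 55.55  8.02 67.47; 18.0 62.85  6.40 58.83; 10.4 30.01  4.61 60.26; ...
       8.2 29.28  6.11 50.71; 11.4 62.88  5.31 61.49; 11.6 28.57  9.08 68.47; ...
       8.4 30.23  6.03 55.55;  8.2 15.96  8.04 40.26; 10.9 24.75  8.34 46.01; ...
      15.6 21.44 28.62 46.01];          % row 13 here is sample 24, x1 taken as 8.4
% Samples 28-30, to be classified.
Xnew = [16.5 80.05 8.81 73.04; 20.6 81.24 5.37 60.43; 8.6 42.06 8.88 56.37];

n1 = size(G1, 1);
n2 = size(G2, 1);
M = [mean(G1, 1)', mean(G2, 1)'];       % 4-by-2, one mean vector per column

% Assumption: common covariance, estimated by the pooled sample covariance.
Sp = ((n1 - 1) * cov(G1) + (n2 - 1) * cov(G2)) / (n1 + n2 - 2);

% Assumption: priors proportional to group sizes.
q = [n1, n2] / (n1 + n2);

% Linear scores: score(i, g) = log q_g - 0.5 * m_g' * inv(Sp) * m_g + x_i' * inv(Sp) * m_g
W = Sp \ M;                             % column g is inv(Sp) * m_g
c = log(q) - 0.5 * sum(M .* W, 1);      % constant term of each group
score = Xnew * W + repmat(c, size(Xnew, 1), 1);

% Posterior probabilities (row-wise softmax, shifted by the row maximum for stability).
e = exp(score - repmat(max(score, [], 2), 1, 2));
post = e ./ repmat(sum(e, 2), 1, 2);

[~, label] = max(score, [], 2);         % predicted group (1 or 2) for samples 28, 29, 30
disp([label, post]);
```

Each displayed row holds the predicted group followed by the two posterior probabilities. Dropping the equal-covariance assumption would turn the linear scores into quadratic ones, with a separate covariance matrix and log-determinant term per group.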